Multi-site MRI datasets are crucial for big data research. However, neuroimaging studies that use them must contend with the batch effect. Here, we propose an approach that uses the predictive probabilities provided by Gaussian processes (GPs) to harmonize clinical studies. A multi-site dataset of 216 Parkinson’s disease (PD) patients and 87 healthy subjects (HS) was used. We performed a site GP classification using MRI data. The outcomes estimated from this classification, redefined as Weighted HARMonization PArameters (WHARMPA), were used as regressors in two different clinical studies: a PD versus HS machine learning classification using GP, and a voxel-based morphometry (VBM) comparison (FWE-p < .05, k = 100). The same studies were also conducted using conventional Boolean site covariates, and without any site information. The site GP classification achieved high scores: balanced accuracy (BAC) was 98.39% for grey matter images. PD versus HS classification performed better when the WHARMPA were used to harmonize (BAC = 78.60%; AUC = 0.90) than when the Boolean site information was used (BAC = 56.31%; AUC = 0.71) or when no site information was included (BAC = 57.22%; AUC = 0.73). The VBM analysis harmonized using WHARMPA yielded larger and more statistically robust clusters in regions previously reported in PD than the models that included the Boolean site covariates or no correction. In conclusion, WHARMPA might encode global site effects quantitatively and allow the harmonization of data. This method is user-friendly and provides a powerful solution, without complex implementations, for removing the variability associated with differences between sites from the analyses.
© 2022 The Authors. Human Brain Mapping published by Wiley Periodicals LLC.
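
To make the contrast between Boolean site covariates and probability-weighted covariates concrete, the following minimal Python sketch builds both kinds of regressor for a handful of subjects and computes balanced accuracy as the mean of per-class recall. It is an illustration under stated assumptions, not the authors' implementation: the function names, the two-site setup, and the probability values are hypothetical, and the probabilities are typed in by hand rather than produced by a GP classifier.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall; for two classes this equals
    (sensitivity + specificity) / 2."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        members = [i for i, y in enumerate(y_true) if y == c]
        hits = sum(1 for i in members if y_pred[i] == c)
        recalls.append(hits / len(members))
    return sum(recalls) / len(recalls)


def boolean_site_covariates(site_labels, sites):
    """Conventional coding: one 0/1 column per site."""
    return [[1.0 if label == s else 0.0 for s in sites] for label in site_labels]


def weighted_site_covariates(site_probabilities):
    """Use each subject's predicted site probabilities directly as covariates.

    Assumption (not stated in the abstract): each row sums to 1 and comes
    from a probabilistic site classifier.
    """
    for row in site_probabilities:
        if abs(sum(row) - 1.0) > 1e-6:
            raise ValueError("each row of probabilities should sum to 1")
    return [list(row) for row in site_probabilities]


# Toy values for illustration only; these are not data from the study.
sites = ["A", "B"]
true_site = ["A", "A", "A", "B"]
site_probs = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]

# Hard site prediction = most probable site for each subject.
predicted_site = [sites[row.index(max(row))] for row in site_probs]

bac = balanced_accuracy(true_site, predicted_site)
boolean_cov = boolean_site_covariates(true_site, sites)
weighted_cov = weighted_site_covariates(site_probs)
```

In this toy case the third subject was scanned at site A but looks more like site B to the classifier: the Boolean covariate row is `[1.0, 0.0]`, whereas the weighted row is `[0.3, 0.7]`. That graded value is the kind of quantitative site information the abstract suggests WHARMPA might encode.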
