Comprehensive Evaluation of Pipelines for Classification of Psychiatric Disorders Using Multi-Site Resting-State fMRI Datasets

Background

Psychiatric diagnosis has long relied on symptoms and clinical interviews and lacks objective biomarkers. Resting-state functional magnetic resonance imaging (rs-fMRI) is widely used to characterize patterns of brain function and is therefore a candidate source of classification markers for diagnosing mental disorders. However, because analysis pipelines vary so widely, no widely accepted biomarker has been established. The choice of pipeline strongly affects both diagnostic and generalization performance, yet few studies have systematically examined which pipelines work best. This study therefore comprehensively evaluates analysis pipelines for major depressive disorder (MDD) classification biomarkers using large-scale, multi-site rs-fMRI datasets, with the goal of providing a standardized process for the diagnosis of mental disorders.

Source of the Paper

Comprehensive Evaluation of Psychiatric Disorder Classification Pipelines Using Multi-Site Resting-State fMRI Datasets

This paper was written by a team from several research institutions in Japan. The main authors include Yuji Takahara, Yuto Kashiwagi, Tomoki Tokuda, and others, affiliated with institutions such as the Advanced Telecommunications Research Institute International, Shionogi & Co., Ltd., and The University of Tokyo. The paper was published online on February 28, 2025, in the journal Neural Networks, with the DOI 10.1016/j.neunet.2025.107335.

Research Process

1. Datasets and Preprocessing

The study used three datasets:
  - Dataset I: 713 participants (564 healthy controls and 149 MDD patients) from four sites, with data collected using a unified protocol.
  - Dataset II: 449 participants (264 healthy controls and 185 MDD patients) from four independent sites, with data collected using heterogeneous protocols.
  - Dataset III: 231 participants (125 autism spectrum disorder patients and 106 schizophrenia patients), used to validate the generalization capability of the pipelines.

Data preprocessing was performed with fMRIPrep and included slice-timing correction, motion correction, co-registration, distortion correction, T1-weighted image segmentation, and normalization.

2. Construction of Analysis Pipelines

The study explored combinations of options in four sub-processes:
  - Parcellation: six methods, including Glasser surface parcellation, the Shen atlas, and data-driven dictionary learning.
  - Functional Connectivity (FC) Estimation: four methods, namely Pearson’s full correlation, tangent space covariance, partial correlation, and distance correlation.
  - Site-Difference Harmonization: three options, namely traveling-subject harmonization, ComBat harmonization, and no correction.
  - Machine Learning Methods: five methods, namely Lasso, sparse logistic regression, Ridge, support vector machine (SVM), and random forest.
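As a rough illustration of the FC estimation step, the sketch below computes Pearson’s full correlation between every pair of regional time series and keeps the upper triangle as a feature vector. It is a toy under stated assumptions, not the study's implementation: the function names and the three made-up time series are hypothetical, and real parcellations yield far more regions.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long time series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def fc_features(roi_timeseries):
    """Flatten the upper triangle of the correlation matrix into a list."""
    n_rois = len(roi_timeseries)
    return [
        pearson(roi_timeseries[i], roi_timeseries[j])
        for i in range(n_rois)
        for j in range(i + 1, n_rois)
    ]

# Hypothetical toy data: 3 regions, 5 time points each.
toy = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with region 0
    [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as region 0 rises
]
features = fc_features(toy)  # one value per region pair: (0,1), (0,2), (1,2)
```

With N regions this produces N(N-1)/2 features per participant, which is why the choice of parcellation directly sets the dimensionality the classifier has to handle.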

By combining these options, a total of 360 different MDD classification biomarkers were constructed.
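The count follows from multiplying the number of options per sub-process: 6 × 4 × 3 × 5 = 360. A minimal sketch of that enumeration is shown below. The identifier strings are illustrative labels rather than names taken from the study's code, and the last three parcellation entries are placeholders because only three of the six parcellations are named in this summary.

```python
from itertools import product

parcellations = [
    "glasser_surface", "shen_atlas", "dictionary_learning",
    "parcellation_4", "parcellation_5", "parcellation_6",  # placeholders
]
fc_methods = [
    "full_correlation", "tangent_space_covariance",
    "partial_correlation", "distance_correlation",
]
harmonizations = ["traveling_subject", "combat", "none"]
classifiers = ["lasso", "sparse_logistic_regression", "ridge", "svm", "random_forest"]

# Every pipeline is one choice from each of the four sub-processes.
pipelines = list(product(parcellations, fc_methods, harmonizations, classifiers))
print(len(pipelines))  # 360
```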

3. Construction and Validation of Classification Biomarkers

Using Dataset I as the discovery dataset, MDD classification biomarkers were constructed and evaluated through 10-fold nested cross-validation (CV). The biomarkers were then applied to Dataset II for independent validation. To rule out dependence on a particular dataset, the roles of Dataset I and Dataset II were swapped and the same procedure was repeated.
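The structure of nested CV can be sketched in a few lines: an inner loop tunes a hyperparameter using training data only, and an outer loop scores the tuned model on folds it never saw. The example below is a sketch under stated assumptions, not the study's code. The one-feature threshold model, the `alpha` hyperparameter, the five inner folds, and the synthetic data are all hypothetical stand-ins for the real classifiers and FC features.

```python
import random

def k_folds(indices, k, seed):
    """Shuffle the given indices and deal them into k folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def fit(xs, ys, idx, alpha):
    """Toy model: a threshold placed between the two class means.
    Assumes class 1 has the higher mean and both classes occur in idx."""
    class0 = [xs[i] for i in idx if ys[i] == 0]
    class1 = [xs[i] for i in idx if ys[i] == 1]
    mean0 = sum(class0) / len(class0)
    mean1 = sum(class1) / len(class1)
    return mean0 + alpha * (mean1 - mean0)

def score(threshold, xs, ys, idx):
    """Accuracy of the rule 'predict class 1 when x > threshold'."""
    return sum((xs[i] > threshold) == (ys[i] == 1) for i in idx) / len(idx)

def select_alpha(xs, ys, idx, alphas, inner_k):
    """Inner loop: pick the alpha with the best mean validation accuracy."""
    folds = k_folds(idx, inner_k, seed=1)

    def mean_val_score(alpha):
        total = 0.0
        for val_idx in folds:
            held_out = set(val_idx)
            fit_idx = [i for i in idx if i not in held_out]
            total += score(fit(xs, ys, fit_idx, alpha), xs, ys, val_idx)
        return total / inner_k

    return max(alphas, key=mean_val_score)

def nested_cv(xs, ys, alphas, outer_k=10, inner_k=5):
    """Outer loop: score each tuned model on a fold unused for tuning."""
    all_idx = list(range(len(xs)))
    fold_scores = []
    for test_idx in k_folds(all_idx, outer_k, seed=0):
        held_out = set(test_idx)
        train_idx = [i for i in all_idx if i not in held_out]
        alpha = select_alpha(xs, ys, train_idx, alphas, inner_k)
        threshold = fit(xs, ys, train_idx, alpha)
        fold_scores.append(score(threshold, xs, ys, test_idx))
    return fold_scores

# Synthetic one-feature data: class 1 is centred at 3.0, class 0 at 0.0.
rng = random.Random(42)
ys = [0] * 100 + [1] * 100
xs = [rng.gauss(3.0 * y, 1.0) for y in ys]

fold_scores = nested_cv(xs, ys, alphas=[0.3, 0.5, 0.7])
mean_score = sum(fold_scores) / len(fold_scores)
```

Keeping hyperparameter selection inside the outer training folds is what prevents the reported CV score from being inflated by tuning on test data. Independent validation on a second dataset, as in the role swap described above, is a further check on top of this.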

4. Performance Evaluation

Evaluation metrics included the area under the curve (AUC), accuracy, sensitivity, specificity, and Matthews correlation coefficient (MCC). The study also defined two custom metrics, “composite score” and “instability,” to comprehensively assess biomarker performance.
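The threshold-based metrics above all derive from the four cells of a confusion matrix. The sketch below computes them for a small made-up set of labels, treating MDD as the positive class (1). The function name and data are hypothetical. AUC is omitted because it needs continuous scores rather than hard labels, and the “composite score” and “instability” metrics are not reproduced because their definitions are not given in this summary.

```python
import math

def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity, and MCC for binary labels.
    Assumes both classes are present in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)

    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        # MCC is conventionally set to 0 when the denominator vanishes.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Made-up labels: 4 patients and 6 controls.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = classification_metrics(y_true, y_pred)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it is less easily inflated by class imbalance such as the 564 controls versus 149 patients in Dataset I.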

5. Biomarker Similarity Analysis

The study analyzed the classification results and weight similarity of the top 10 biomarkers to verify their consistency. By mapping functional connectivity to Yeo et al.’s brain networks, the network utilization rates of important functional connections were compared across different biomarkers.
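One plausible way to compute a network utilization rate is to map each important connection's two regions to their networks and report the share of endpoints that fall in each network. The sketch below assumes that definition, which this summary does not confirm. The region-to-network assignment and the edge list are hypothetical.

```python
from collections import Counter

def network_utilization(edges, roi_to_network):
    """Share of edge endpoints that fall in each network (assumed definition)."""
    counts = Counter()
    for roi_a, roi_b in edges:
        counts[roi_to_network[roi_a]] += 1
        counts[roi_to_network[roi_b]] += 1
    total = sum(counts.values())
    return {network: n / total for network, n in counts.items()}

# Hypothetical assignment of four regions to three networks.
roi_to_network = {0: "default", 1: "default", 2: "visual", 3: "somatomotor"}
# Hypothetical connections with large classifier weights.
important_edges = [(0, 1), (0, 2), (1, 3), (2, 3)]

rates = network_utilization(important_edges, roi_to_network)
```

Comparing such rate profiles across biomarkers gives a parcellation-independent view of where the informative connections sit, which is useful when the biomarkers being compared were built on different atlases.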

6. Application to Other Mental Disorders

The top 10 pipelines were applied to datasets of autism spectrum disorder (ASD) and schizophrenia (SCZ) to validate their classification performance in these disorders.

Main Results

1. Comparison of Classification Performance

The study found that Glasser surface parcellation and data-driven dictionary learning performed best among parcellations, while Pearson’s full correlation and tangent space covariance performed best for functional connectivity estimation. For site-difference harmonization, traveling-subject harmonization and no correction performed similarly, while ComBat harmonization showed significantly lower performance. Among machine learning methods, non-sparse methods (such as Ridge and SVM) outperformed sparse methods.

2. Dataset Role Swap Validation

By swapping the roles of Dataset I and Dataset II, the study validated the generalization capability of the pipelines. The results showed that Glasser surface parcellation and Pearson’s full correlation methods remained stable under different dataset roles.

3. Biomarker Similarity

The top 10 biomarkers showed high consistency in classification results, and the weight patterns were highly similar among eight biomarkers. The two biomarkers using data-driven dictionary parcellation and tangent space covariance showed lower weight similarity to others, indicating that they might capture different MDD characteristics.

4. Application to Other Mental Disorders

In the classification of ASD and SCZ, eight of the top 10 pipelines demonstrated sufficient classification performance, indicating their broad application potential in other mental disorders.

Conclusions and Significance

This study comprehensively evaluated analysis pipelines for MDD classification biomarkers using large-scale, multi-site rs-fMRI datasets and identified the best options for parcellation, functional connectivity estimation, site-difference harmonization, and machine learning. The results showed that pipelines combining Glasser surface parcellation, Pearson’s full correlation, no site-difference correction, and non-sparse machine learning methods produced classification biomarkers with high generalization performance. These pipelines also performed well in the classification of ASD and SCZ, supporting their use as a standardized process for the diagnosis of mental disorders.

Research Highlights

  1. Systematic Evaluation: This study is the first to systematically evaluate the performance of multiple advanced methods (such as Glasser surface parcellation, distance correlation, and site-difference harmonization) within a unified framework.
  2. Generalization Performance Validation: By swapping dataset roles, the study validated the generalization capability of the pipelines, ensuring their robustness across different datasets.
  3. Multi-Disorder Application: The study applied the top 10 pipelines to ASD and SCZ, demonstrating their broad application potential in multiple mental disorders.

Other Valuable Information

The data and code from this study are available through the DecNef Project Brain Data Repository (https://bicr.atr.jp/decnefpro/data), making them a resource for other researchers.