Validity of European-Centric Cardiometabolic Polygenic Scores in Multi-Ancestry Populations
In recent years, Polygenic Scores (PGS) have received widespread attention as a tool for assessing individual genetic risk. However, most existing PGS are based on Genome-Wide Association Studies (GWAS) of White European populations, which raises questions about their validity in non-European populations. This paper evaluates the performance of PGS across ethnic groups, particularly South Asian and African Caribbean populations, and considers the implications for health inequalities.
Background
Polygenic Scores (PGS) provide an assessment of an individual’s genetic risk for a specific disease. However, the vast majority of current PGS are derived from GWAS of European populations. As a result, their validity in other ethnic groups has yet to be confirmed, which matters all the more because cardiometabolic diseases have a greater impact on non-European populations. To address this gap, the authors evaluated the performance of PGS for cardiometabolic traits using cross-ethnic data from the UK Biobank.
Paper Source
This paper was written by Constantin-Cristian Topriceanu, Nish Chaturvedi, Rohini Mathur, and Victoria Garfield, and was published in the European Journal of Human Genetics in 2024.
Research Process
Study Subjects and Data Sources
The study used data from the UK Biobank, a prospective cohort study of over 500,000 UK adults aged 40-69. The data included participants’ self-reported ethnicity, genetic information, health outcomes, and imaging data. Categorized according to the 2001 UK statistical standard, self-reported ethnicity was 94.4% White European, 0.2% South Asian, 0.2% African Caribbean, and 5.2% other/unknown.
Construction of Polygenic Scores
The study used standard and enhanced PGS developed by Thompson et al. Standard PGS only included external GWAS data, while enhanced PGS included both external and UK Biobank’s own GWAS data. All PGS were based on specific cardiometabolic traits, such as Type 1 Diabetes (T1DM), Type 2 Diabetes (T2DM), Glycated Hemoglobin (HbA1c), Body Mass Index (BMI), Hypertension, Coronary Artery Disease (CAD), Ischaemic Stroke, Cardiovascular Disease (CVD), High-Density Lipoprotein (HDL), Low-Density Lipoprotein (LDL), Total Cholesterol, and Triglycerides.
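As a rough illustration of what a polygenic score is, the sketch below computes a score as a weighted sum of effect-allele dosages and then standardizes it within a sample. This is a generic textbook formulation, not the construction used by Thompson et al.; the weights, genotypes, and function names are hypothetical.

```python
from statistics import mean, pstdev


def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0, 1, or 2 per variant)."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have the same length")
    return sum(d * w for d, w in zip(dosages, weights))


def standardize(scores):
    """Rescale scores to mean 0 and standard deviation 1 within a sample."""
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mu) / sd for s in scores]


# Hypothetical per-variant weights, e.g. log odds ratios from a GWAS.
weights = [0.12, -0.05, 0.30]

# Hypothetical effect-allele dosages for three individuals.
genotypes = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
]

raw_scores = [polygenic_score(g, weights) for g in genotypes]
z_scores = standardize(raw_scores)
```

Because the weights come from the GWAS population, a score built this way carries that population's effect estimates into every other group it is applied to.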
Result Analysis and Validation
Binary outcomes, such as cardiovascular disease, were analyzed using logistic regression models, with model performance evaluated using receiver operating characteristic (ROC) curves and area under the curve (AUC) values. Continuous traits were analyzed using generalized linear models (GLMs) with a Gamma distribution.
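To make the AUC metric concrete, the following minimal sketch computes it directly from its probabilistic definition: the chance that a randomly chosen case has a higher score than a randomly chosen control. It is a pairwise illustration on made-up data, not the authors' analysis code, and the function name is hypothetical.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up scores for two controls (label 0) and two cases (label 1).
example_scores = [0.1, 0.4, 0.35, 0.8]
example_labels = [0, 0, 1, 1]
example_auc = auc(example_scores, example_labels)  # 3 of 4 pairs ranked correctly
```

An AUC of 0.5 means the score ranks cases above controls no more often than chance, while 1.0 means perfect separation.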
In addition to the analyses of raw data, models were adjusted for age, sex, and socioeconomic status. Further sensitivity analyses adjusted for the use of diabetes medications and lipid-lowering drugs, so that the estimated effect of each PGS on metabolic traits was not distorted by treatment.
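One way to write the adjusted logistic model described above is sketched below. The notation is an assumption made for illustration: the summary does not give the exact specification or covariate coding, and SES stands for the socioeconomic status measure.

```latex
\operatorname{logit}\,\Pr(Y_i = 1)
  = \beta_0
  + \beta_{\mathrm{PGS}}\,\mathrm{PGS}_i
  + \beta_1\,\mathrm{age}_i
  + \beta_2\,\mathrm{sex}_i
  + \beta_3\,\mathrm{SES}_i,
\qquad
\mathrm{OR} = e^{\beta_{\mathrm{PGS}}}
```

Under this form, the reported odds ratio is the multiplicative change in the odds of the outcome per one-unit increase in the PGS, holding the covariates fixed.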
Main Research Findings
Predictive Performance of Polygenic Scores Across Ethnicities
Type 1 Diabetes (T1DM): PGS performed best in White Europeans (OR=3.09, AUC=0.84), while performance was poorer in South Asians (OR=1.52, AUC=0.63) and African Caribbeans (OR=1.40, AUC=0.50, which is no better than chance).
Type 2 Diabetes (T2DM): PGS performed best in White Europeans (OR=2.48, AUC=0.80), followed by South Asians (OR=2.05, AUC=0.76) and African Caribbeans (OR=1.51, AUC=0.73).
Glycated Hemoglobin (HbA1c): White Europeans and South Asians had higher regression coefficients (β≈1.7), while African Caribbeans had relatively lower coefficients (β≈1.03).
Body Mass Index (BMI): Enhanced PGS performed best in White Europeans (β≈1.71), with poorer performance in South Asians and African Caribbeans (β≈1.31 and β≈0.90, respectively).
Cardiovascular Disease (CVD) and Coronary Artery Disease (CAD): White Europeans (CVD OR=1.61; CAD OR=1.61) and South Asians (CVD OR=1.58; CAD OR=1.58) showed significantly better predictive performance than African Caribbeans (CVD OR=1.20; CAD OR=1.20).
Stroke: PGS performance in stroke prediction was similar across all ethnicities (AUC≈0.70, OR=1.20-1.40).
High-Density Lipoprotein (HDL) and Low-Density Lipoprotein (LDL): PGS for White Europeans outperformed the other two ethnic groups in predicting both HDL and LDL.
Total Cholesterol and Triglycerides: PGS for total cholesterol performed best in White Europeans, while PGS for triglycerides performed best in South Asians.
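To help read the odds ratios above, the sketch below shows how an OR compounds across several units of a score under a logistic model, where log-odds are linear in the PGS. It assumes the reported ORs are expressed per one unit of the PGS (for example, one standard deviation); the summary does not state the unit, so treat this as illustrative arithmetic only. The function name is hypothetical.

```python
import math


def odds_multiplier(or_per_unit, units):
    """Multiplicative change in odds for a shift of `units` in the score.

    Assumes a logistic model, so log-odds change by log(OR) per unit.
    """
    if or_per_unit <= 0:
        raise ValueError("odds ratio must be positive")
    return math.exp(units * math.log(or_per_unit))


# T1DM odds ratios reported above, applied to a shift of two units.
t1dm_white_european = odds_multiplier(3.09, 2)     # 3.09 squared
t1dm_african_caribbean = odds_multiplier(1.40, 2)  # 1.40 squared
```

Under that assumption, the same two-unit difference in score corresponds to roughly a 9.5-fold difference in odds in White Europeans but only about a 2-fold difference in African Caribbeans, which shows how quickly a weaker per-unit association erodes risk stratification.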
Study Limitations
The study limitations include:
1. Lack of data breadth: Most GWAS data come from White Europeans, with less data from non-white populations.
2. Validity of PGS construction: Differences in linkage disequilibrium (LD) across ethnicities may lead to different effect sizes.
3. Bias in self-reported ethnicity: Self-reported ethnicity may not fully reflect genetic ancestry information, although there is some overlap.
Research Conclusions
Polygenic scores show better predictive performance in White Europeans, with relatively poorer performance in South Asians and African Caribbeans. This suggests that GWAS need to include more data from non-European populations to produce more representative PGS; otherwise, implementing PGS in clinical practice risks exacerbating existing health inequalities.
Scientific and Applied Value of the Research
This study emphasizes the importance of ethnic diversity in advancing precision medicine and polygenic score applications. Expanding large-scale multi-ancestry GWAS to include more non-European populations can improve the accuracy of polygenic scores, reduce health inequalities, and bring fairer health benefits to populations of different ethnicities.