Genome, HLA and Polygenic Risk Score Analyses for Prevalent and Persistent Cervical Human Papillomavirus (HPV) Infections

Genome-wide and Polygenic Risk Score Analysis of High-Risk Human Papillomavirus (hrHPV) Infection in the Cervix

Background

Cervical high-risk human papillomavirus (hrHPV) infection is the second most common carcinogenic infection worldwide, accounting for approximately 31.4% of all infection-related cancers (about 690,000 of the 2.2 million infection-attributable cancer cases worldwide). Most HPV infections occur shortly after initial sexual contact, and over 90% of infections clear spontaneously within two years. However, persistent HPV infection is a necessary but not sufficient condition for the development of anogenital and oropharyngeal cancers.

Although environmental factors (such as smoking, long-term use of hormonal contraceptives, and HIV co-infection) have significant effects on the persistence and clearance of HPV infection, several observations suggest that genetic factors may also play an important role in the prevalence and persistence of HPV infection. However, there are still relatively few studies of genetic variants related to cervical HPV infection.

Research Source

This paper is authored by Sally N. Adebamowo, Adebowale Adeyemo, Amos Adebayo, Peter Achara, Bunmi Alabi, Rasheed A. Bakare, Ayotunde O. Famooto, Kayode Obende, Richard Offiong, Olayinka Olaniyan, Sanni Ologun, Charles Rotimi, and colleagues. The study was conducted by the ACCME research team within the H3Africa consortium and published in the European Journal of Human Genetics (2024).

Research Methods

1. Study Population and Genotype Analysis

The study subjects were a cohort of over 10,000 women. Research methods included a discovery genome-wide association study (GWAS), replication study, meta-analysis, and co-localization analysis, combined with polygenic risk score (PRS) and classical HLA (Human Leukocyte Antigen) allele studies.

Genotyping and imputation were performed at the Center for Inherited Disease Research (CIDR) at Johns Hopkins University, using the Illumina H3Africa_2017_20021485 chip and the MEGA chip. Genotypes were imputed against the TOPMed reference panel, and the imputed data were used for association analysis.

2. GWAS and Multiple Analyses

The GWAS case-control analysis involved 903 women with hrHPV infection. Of these, 224 were included in the prevalent hrHPV analysis because they were hrHPV positive only at baseline, and 679 were included in the persistent hrHPV analysis because they were positive at both baseline and follow-up. The control group consisted of 9,846 women who were not infected with HPV.
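The grouping rule described above can be written out as a small sketch. The function name, the group labels, and the toy records are hypothetical. Two points are assumptions that the summary does not state: controls are treated here as women who tested negative at both visits, and women who were negative at baseline but positive at follow-up are set aside.

```python
from collections import Counter

def classify_hrhpv(baseline_positive, followup_positive):
    """Assign a GWAS phenotype group from two hrHPV test results.

    Hypothetical helper; labels follow the grouping described in the text.
    """
    if baseline_positive and followup_positive:
        return "persistent"   # positive at baseline and at follow-up
    if baseline_positive:
        return "prevalent"    # positive at baseline only
    if not followup_positive:
        return "control"      # assumption: negative at both visits
    return "excluded"         # assumption: negative then positive, not described in the text

# Toy records (made-up data): (baseline result, follow-up result)
visits = [(True, True), (True, False), (False, False), (False, True), (True, True)]
groups = Counter(classify_hrhpv(b, f) for b, f in visits)
print(groups)

# The case counts reported in the text add up as expected.
assert 224 + 679 == 903
```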

Additional analyses included meta-analysis of the GWAS data, analysis of HLA alleles, and HLA-peptide binding predictions. Data analysis was performed using PLINK 1.9, SNPTEST2, METAL, and METASOFT.
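To illustrate what a GWAS meta-analysis combines, here is a minimal sketch of a fixed-effect, inverse-variance-weighted meta-analysis for a single variant. METAL offers this weighting scheme alongside a sample-size-based one, but the summary does not say which scheme the authors used, so this should be read as one plausible variant rather than the study's actual procedure. The effect sizes and standard errors are made up.

```python
import math

def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance-weighted meta-analysis of per-study effects."""
    # Each study is weighted by the inverse of its squared standard error.
    weights = [1.0 / se ** 2 for se in ses]
    total_w = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    # Two-sided P-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p

# Made-up log odds ratios and standard errors for one variant in two studies
beta, se, z, p = inverse_variance_meta([0.40, 0.30], [0.10, 0.20])
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p={p:.2e}")
```

The more precise study (smaller standard error) dominates the pooled estimate, which is why the combined effect lands closer to 0.40 than to 0.30.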

3. Gene Enrichment Analysis and Functional Annotation

Functional annotation was performed using the HaploReg database, and gene enrichment analysis was conducted using the MAGMA tool. The analysis covered the regulatory functions of variant sites and the significance of gene sets. The GTEx database was queried to examine gene expression levels in different tissues.

4. Polygenic Risk Score (PRS) Model

Polygenic risk scores were constructed using PRSice-2 and PRS-CS, and model fits were compared across different P-value thresholds to evaluate the predictive power of polygenic risk scores for prevalent and persistent hrHPV infection.
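The idea of comparing scores across P-value thresholds can be sketched as follows. This is a simplified illustration of the thresholding approach only: it omits linkage-disequilibrium clumping, and it does not represent the Bayesian shrinkage used by PRS-CS. All SNP names, weights, P-values, and dosages are hypothetical.

```python
def polygenic_score(dosages, weights, pvalues, p_threshold):
    """Sum effect-allele dosages times GWAS weights over SNPs with P <= p_threshold."""
    score = 0.0
    n_used = 0
    for snp, dosage in dosages.items():
        # SNPs without a reported P-value are treated as P = 1.0 (lowest priority).
        if snp in weights and pvalues.get(snp, 1.0) <= p_threshold:
            score += weights[snp] * dosage
            n_used += 1
    return score, n_used

# Hypothetical summary statistics: log odds ratio and P-value per SNP
weights = {"snpA": 0.20, "snpB": -0.10, "snpC": 0.05}
pvalues = {"snpA": 1e-6, "snpB": 5e-4, "snpC": 0.20}
# Effect-allele dosages (0, 1 or 2) for one hypothetical individual
person = {"snpA": 2, "snpB": 1, "snpC": 1}

for threshold in (0.001, 0.05, 1.0):
    score, n_used = polygenic_score(person, weights, pvalues, threshold)
    print(f"P <= {threshold}: {n_used} SNPs, score = {score:.2f}")
```

In practice, the score at each threshold would be regressed against case-control status, and the threshold giving the best model fit would be reported.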

Research Results

1. Genome-wide Association Analysis

  • For prevalent hrHPV, rs116471799 (near the LDB2 gene on chromosome 4) showed significant association.
  • For persistent hrHPV, rs2342234 (near the TPTE2 gene), rs115537401 (near the SMAD2 gene), and rs1879062 and rs1028206 (near the CDH12 gene) showed significant associations.
  • Meta-analysis confirmed these associations and identified new variants associated with prevalent hrHPV, including alsopatP3CA and NCK2.

2. HLA Allele Association Analysis

  • The study found that HLA-DRB1*15:03, HLA-DRB1*13:02, HLA-DQB1*05:02, and HLA-DRB1*03:01 were associated with persistent hrHPV infection.
  • Peptide binding predictions showed that HLA-DRB1 alleles positively associated with persistent infection had weaker binding capacity to hrHPV protein-derived peptide chains, while alleles negatively associated with persistent infection showed the opposite trend.

3. Gene Enrichment Analysis

  • MAGMA analysis showed that gene sets related to the p53 downstream pathway were associated with prevalent hrHPV, and that antigen processing and presentation gene sets were associated with persistent hrHPV.
  • These findings suggest that these gene sets may play important roles in the process of hrHPV infection persistence.

4. Polygenic Risk Score

  • For prevalent hrHPV, PRS models at multiple P-value thresholds showed good fit, with the best model fit reaching 3.3% (p-value 0.001).
  • For persistent hrHPV, polygenic risk score results showed moderate associations.

Research Conclusions

This study identified several new gene loci associated with cervical hrHPV infection by examining genome-wide genetic variation, HLA alleles, and polygenic risk scores. Persistent hrHPV infection was significantly associated with the HLA-DRB1 and HLA-DQB1 regions and with antigen processing and presentation gene sets.

Research Significance and Value

  • This study is the most comprehensive genome-wide association analysis of cervical hrHPV infection to date, contributing to a deeper understanding of genetic risk factors for hrHPV infection.
  • New gene loci associated with hrHPV infection were discovered, laying the foundation for further exploration of specific mechanisms and development of targeted prevention and treatment measures.
  • The study particularly focused on the genetic structure of African populations, helping to improve racial/ethnic representation and equity in gene-related research.

This research provides a new perspective for genetic studies of cervical hrHPV infection and its persistence. Further validation of results and expansion of research will provide more scientific support for this field.