Using Computational Approaches to Enhance the Interpretation of Missense Variants in the PAX6 Gene

Background

The PAX6 gene encodes a highly conserved transcription factor that plays a crucial role in eye development. Heterozygous loss-of-function variants in PAX6 can cause a range of ophthalmic disorders, including aniridia. However, many PAX6 missense variants are currently classified as variants of uncertain significance (VUS), which poses a significant challenge for molecular diagnosis. Computational tools can be used to predict the impact of such variants, but their predictive accuracy varies. In this study, the authors evaluated and optimized the performance of computational prediction tools for PAX6 missense variants.

About the Source

The paper was written by Nadya S. Andhika, Susmito Biswas, Claire Hardcastle, and colleagues from several institutions, including the Division of Evolution, Infection and Genomics, School of Biological Sciences, University of Manchester; Manchester University NHS Foundation Trust; and the European Molecular Biology Laboratory (EMBL-EBI). It was published online in the European Journal of Human Genetics on June 7, 2024.

Research Workflow

Dataset Collection

The authors collected PAX6 missense variants from publicly available resources: gnomAD, LOVD, HGMD, and ClinVar. They also searched the biomedical literature, focusing on articles published between 2021 and 2023. After screening and classification, a total of 241 PAX6 missense variants were retained for model training and evaluation, divided into two subsets: “primary dataset neutral” and “primary dataset disease”.

Computational Tool Evaluation

The study evaluated ten commonly used computational prediction tools: AlphaMissense, BayesDel, CADD, ClinPred, Eigen, MutPred2, PolyPhen-2, REVEL, SIFT4G, and VEST4. These tools use different algorithms to assess variant pathogenicity and draw on different kinds of evidence, including evolutionary conservation and protein/domain structure.

Performance Evaluation and Threshold Optimization

The initial performance evaluation showed that most tools had high sensitivity but low specificity: they detected most pathogenic variants but also flagged many neutral variants as pathogenic. To address this, the authors proposed a gene-specific threshold optimization method, determining the optimal threshold for each tool through Receiver Operating Characteristic (ROC) curve analysis. The optimized thresholds significantly improved the predictive performance of the tools.
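The idea of choosing a gene-specific cut-off from a ROC curve can be sketched in a few lines. This summary does not state which criterion the authors used to pick the optimal point, so the sketch below assumes Youden's J statistic (sensitivity + specificity - 1), a common choice. The scores, the labels, and the generic cut-off of 0.5 are hypothetical values for illustration, not data from the study.

```python
def sens_spec(scores, labels, threshold):
    """Sensitivity and specificity when score >= threshold is called pathogenic."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return tp / n_pos, tn / n_neg


def optimal_threshold(scores, labels):
    """Return the candidate threshold that maximises Youden's J (an assumed criterion)."""
    def youden_j(t):
        sens, spec = sens_spec(scores, labels, t)
        return sens + spec - 1
    # Every observed score is a candidate cut-off, i.e. one point on the ROC curve.
    return max(sorted(set(scores)), key=youden_j)


# Hypothetical tool scores, not taken from the study (1 = disease, 0 = neutral).
scores = [0.95, 0.90, 0.85, 0.80, 0.75, 0.55, 0.70, 0.40, 0.30, 0.10]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

generic_sens, generic_spec = sens_spec(scores, labels, 0.5)  # assumed generic cut-off
best = optimal_threshold(scores, labels)
tuned_sens, tuned_spec = sens_spec(scores, labels, best)

print(f"generic 0.50: sensitivity={generic_sens:.2f}, specificity={generic_spec:.2f}")
print(f"tuned   {best:.2f}: sensitivity={tuned_sens:.2f}, specificity={tuned_spec:.2f}")
```

In this toy example the tuned cut-off trades a little sensitivity for a gain in specificity, which is the direction of change the study reports for most tools.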

Validation and Further Assessment

To validate the initial results, the study carried out a further evaluation using five-fold cross-validation. In addition, variants collected from the local database of the Manchester Centre for Genomic Medicine were used as a secondary dataset. This analysis further confirmed the high performance of the AlphaMissense tool under optimized thresholds.
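A five-fold cross-validation of a tuned threshold might look like the following sketch: the threshold is chosen on four folds and then checked on the held-out fold, so that each variant is evaluated by a cut-off it did not help select. The fold assignment, the Youden's J criterion, and the scores are all assumptions made for illustration; the summary does not describe the authors' exact procedure.

```python
def youden_threshold(scores, labels):
    """Pick the cut-off maximising sensitivity + specificity - 1 (assumed criterion)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos

    def youden_j(t):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        return tp / n_pos + tn / n_neg - 1

    return max(sorted(set(scores)), key=youden_j)


def stratified_folds(labels, k=5):
    """Deal indices round-robin within each class so every fold keeps the class ratio.
    (A real analysis would normally shuffle first; omitted here to stay deterministic.)"""
    folds = [[] for _ in range(k)]
    for cls in (0, 1):
        members = [i for i, y in enumerate(labels) if y == cls]
        for n, i in enumerate(members):
            folds[n % k].append(i)
    return folds


def cross_validate(scores, labels, k=5):
    """Return (threshold, held-out accuracy) for each of the k folds."""
    results = []
    for held_out in stratified_folds(labels, k):
        held = set(held_out)
        train = [i for i in range(len(labels)) if i not in held]
        t = youden_threshold([scores[i] for i in train], [labels[i] for i in train])
        correct = sum(1 for i in held_out if (scores[i] >= t) == (labels[i] == 1))
        results.append((t, correct / len(held_out)))
    return results


# Hypothetical scores for 10 disease (1) and 10 neutral (0) variants.
scores = [0.97, 0.93, 0.91, 0.88, 0.84, 0.80, 0.77, 0.72, 0.66, 0.52,
          0.74, 0.61, 0.48, 0.45, 0.39, 0.33, 0.27, 0.21, 0.15, 0.08]
labels = [1] * 10 + [0] * 10

cv_results = cross_validate(scores, labels)
for fold, (t, acc) in enumerate(cv_results, start=1):
    print(f"fold {fold}: threshold={t:.2f}, held-out accuracy={acc:.2f}")
```

If the thresholds chosen in the different folds are similar and the held-out accuracy stays high, the tuned cut-off is less likely to be an artefact of one particular split.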

Main Results

After threshold optimization, AlphaMissense achieved the highest Matthews correlation coefficient (MCC), 0.81, outperforming the other tools. It was followed by SIFT4G and REVEL, each with an MCC of 0.77. With the optimal thresholds, all tools showed significant improvements in their performance metrics, especially specificity.
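For readers unfamiliar with the MCC, the sketch below computes it from a confusion matrix using the standard definition, alongside sensitivity, specificity, and accuracy. The function names and the counts in the example are hypothetical and are not the study's confusion matrices.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient: +1 is perfect, 0 is chance level, -1 is total disagreement."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # conventional value when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denominator


def basic_metrics(tp, tn, fp, fn):
    """Sensitivity, specificity, and accuracy from the same four counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Hypothetical counts for one tool at one threshold.
tp, tn, fp, fn = 45, 40, 10, 5
example_mcc = mcc(tp, tn, fp, fn)
example_metrics = basic_metrics(tp, tn, fp, fn)
print(f"MCC={example_mcc:.2f}", example_metrics)
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so a tool cannot score well simply by calling most variants pathogenic when the dataset is unbalanced.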

Specific Results

On the primary dataset, the optimized AlphaMissense tool predicted PAX6 missense variants with a sensitivity of 96% and an accuracy of 89%. On the secondary dataset, AlphaMissense and SIFT4G remained the strongest performers, showing high sensitivity and high specificity, respectively.

Conclusion and Value

This study demonstrates that adjusting the thresholds of computational tools to suit a specific gene can significantly enhance their predictive performance for missense variants. This has important implications for the interpretation of PAX6 variants: it provides more accurate molecular diagnostic tools for clinical settings and could thereby improve the precision and timeliness of diagnosis.

Research Highlights

  1. Strong Performance of Optimized AlphaMissense: With an optimized threshold, AlphaMissense was the best-performing tool for evaluating PAX6 missense variants, with an MCC of 0.81.

  2. Importance of Gene-Specific Thresholds: By adopting gene-specific thresholds, the predictive performance of various common tools was significantly improved.

  3. Clinical Application Prospects: These optimized tools can provide more precise variant interpretation in clinical settings, helping doctors make more accurate diagnostic and treatment decisions.

Summary

This article provides valuable insights into how to optimize computational prediction tools for assessing variants in specific genes and proposes concrete optimization methods, offering important reference value for the clinical interpretation and diagnosis of PAX6 gene variants.