Mapping Medically Relevant RNA Isoform Diversity in the Aged Human Frontal Cortex with Deep Long-Read RNA-Seq

Academic Background

RNA isoforms play a critical role in the regulation of gene expression. On average, human protein-coding genes have more than eight RNA isoforms, which give rise to nearly four distinct protein-coding sequences per gene. Because of their limited read length, traditional short-read sequencing technologies cannot accurately assemble and quantify RNA isoforms, so much of basic biology has been studied at an oversimplified, gene-level resolution. A growing body of research shows that different isoforms of the same gene can have distinct interaction networks at the protein level, and that some isoforms of the same gene have completely different or even opposite functions within a cell. For example, the BCL-X gene produces two isoforms: Bcl-xL, which is anti-apoptotic, and Bcl-xS, which is pro-apoptotic. Identifying the functions of individual RNA isoforms is therefore crucial for understanding disease mechanisms and developing new therapeutic approaches.

Research Source

This study was conducted by Bernardo Aguzzoli Heberle, J. Anthony Brandon, Madeline L. Page, and colleagues from institutions including the University of Kentucky and Emory University. The paper was published in Nature Biotechnology (DOI: https://doi.org/10.1038/s41587-024-02245-9).

Research Workflow

The goal of this study was to map medically relevant RNA isoform diversity in the human frontal cortex using deep long-read sequencing. The workflow consisted of the following steps:

Sample Collection and Processing

Twelve postmortem frontal cortex samples from elderly individuals were collected: six from Alzheimer’s Disease (AD) cases and six from controls. Each sample was sequenced on its own Oxford Nanopore PromethION flow cell.

RNA Extraction and Sequencing

Poly(A)-enriched mRNA was sequenced using the PCR cDNA sequencing kit from Oxford Nanopore Technologies. The resulting .fast5 files were basecalled with the Guppy basecaller.

Data Analysis and RNA Isoform Quantification

Bambu was used to quantify RNA isoforms and to discover new isoforms from the sequencing data. A total of 28,989 expressed RNA isoforms were identified across the 12 samples, of which 20,183 were classified as protein-coding and 2,303 as long non-coding RNAs.
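To make the notion of an "expressed" isoform concrete, the following is a minimal sketch of one common way to apply an expression filter: scale raw counts to counts per million (CPM) in each sample, then keep isoforms whose median CPM across samples exceeds a threshold. The function names, the toy counts, and the threshold of 1 CPM are illustrative assumptions; they are not taken from the text above and this is not Bambu's own code.

```python
from statistics import median


def counts_per_million(counts):
    """Scale one sample's raw isoform counts to counts per million (CPM)."""
    total = sum(counts.values())
    return {isoform: 1e6 * c / total for isoform, c in counts.items()}


def expressed_isoforms(samples, min_median_cpm=1.0):
    """Return isoforms whose median CPM across samples exceeds the threshold.

    `samples` is a list of {isoform_id: raw_count} dicts, one per sample.
    The default threshold is an assumption chosen for illustration.
    """
    cpm_per_sample = [counts_per_million(s) for s in samples]
    isoforms = set().union(*(s.keys() for s in samples))
    keep = []
    for isoform in sorted(isoforms):
        # An isoform absent from a sample contributes a CPM of zero.
        values = [cpm.get(isoform, 0.0) for cpm in cpm_per_sample]
        if median(values) > min_median_cpm:
            keep.append(isoform)
    return keep


# Toy counts for three hypothetical samples (invented numbers).
samples = [
    {"tx_A": 500_000, "tx_B": 499_999, "tx_C": 1},
    {"tx_A": 600_000, "tx_B": 399_998, "tx_C": 2},
    {"tx_A": 400_000, "tx_B": 600_000, "tx_C": 0},
]
result = expressed_isoforms(samples)
print(result)
```

Using the median rather than the mean keeps a single deeply sequenced sample from pulling a rarely observed isoform over the threshold.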

Protein-Level Validation

To validate the newly discovered RNA isoforms, existing mass spectrometry data and data from other studies were used. A small number of newly discovered isoforms were successfully validated at the protein level.

Novel Isoform and Differential Expression Analysis

From the known nuclear genome, 1,534 novel isoforms were discovered, of which 428 were classified as high-confidence novel isoforms. Differential expression analysis revealed 99 isoforms that were differentially expressed between Alzheimer’s Disease samples and control samples. Most of these differentially expressed isoforms came from genes that did not show differential expression at the gene level.

Research Results

Discovery of Novel RNA Isoforms

The Bambu software identified a total of 1,534 novel RNA isoforms. After screening and validation, 428 were classified as high-confidence novel isoforms, most of which came from protein-coding genes. These isoforms exhibited greater heterogeneity in long-read sequencing data.

Differential Expression of RNA Isoforms

Between Alzheimer’s Disease samples and control samples, 176 differentially expressed genes and 105 differentially expressed RNA isoforms were identified. For example, two isoforms of the TNFSF12 gene showed opposite expression trends, which underscores the importance of analyzing differential expression at the isoform level in disease research.
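The reason opposite isoform trends can be invisible at the gene level is simple arithmetic: when one isoform goes up and another goes down by a similar amount, their sum barely moves. The sketch below illustrates this with invented CPM values for two isoforms of a hypothetical gene. The numbers, names, and pseudocount are assumptions for illustration only; this is a plain log2 fold-change calculation, not the statistical test used in the study.

```python
from math import log2
from statistics import mean


def log2_fold_change(case, control, pseudocount=1.0):
    """Log2 ratio of mean case to mean control expression.

    The pseudocount avoids division by zero and is an illustrative choice.
    """
    return log2((mean(case) + pseudocount) / (mean(control) + pseudocount))


# Invented CPM values: three cases and three controls per isoform.
isoform_cpm = {
    "isoform_1": {"case": [30.0, 34.0, 32.0], "control": [15.0, 17.0, 16.0]},
    "isoform_2": {"case": [15.0, 17.0, 16.0], "control": [30.0, 34.0, 32.0]},
}

# Isoform-level fold changes.
isoform_lfc = {
    name: log2_fold_change(v["case"], v["control"])
    for name, v in isoform_cpm.items()
}

# Gene-level expression is the per-sample sum over the gene's isoforms.
gene_case = [sum(vals) for vals in zip(*(v["case"] for v in isoform_cpm.values()))]
gene_control = [sum(vals) for vals in zip(*(v["control"] for v in isoform_cpm.values()))]
gene_lfc = log2_fold_change(gene_case, gene_control)

for name, lfc in isoform_lfc.items():
    print(f"{name}: log2FC = {lfc:+.2f}")
print(f"gene level: log2FC = {gene_lfc:+.2f}")
```

In this toy case one isoform has a positive fold change, the other a negative one, and the gene-level fold change is zero, so a gene-level analysis would report no change at all.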

Isoform Diversity and Clinical Relevance

The study showed that 7,042 genes expressed two or more isoforms, and that 1,917 medically relevant genes expressed multiple isoforms. These findings highlight the need to resolve individual RNA isoforms in medical research and clinical diagnostics.
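Counts like these come from grouping expressed isoforms by their parent gene and then intersecting the result with a list of genes of interest. The following is a minimal sketch of that bookkeeping. The isoform and gene identifiers and the "medically relevant" set are hypothetical placeholders, not data from the study.

```python
from collections import defaultdict


def genes_with_multiple_isoforms(isoform_to_gene, min_isoforms=2):
    """Return the set of genes that have at least `min_isoforms` isoforms.

    `isoform_to_gene` maps each expressed isoform ID to its parent gene ID.
    """
    per_gene = defaultdict(set)
    for isoform, gene in isoform_to_gene.items():
        per_gene[gene].add(isoform)
    return {gene for gene, isoforms in per_gene.items() if len(isoforms) >= min_isoforms}


# Hypothetical expressed isoforms and their parent genes.
expressed = {
    "tx1": "GENE_A",
    "tx2": "GENE_A",
    "tx3": "GENE_B",
    "tx4": "GENE_C",
    "tx5": "GENE_C",
    "tx6": "GENE_C",
}

# Hypothetical list of medically relevant genes.
medically_relevant = {"GENE_A", "GENE_B"}

multi = genes_with_multiple_isoforms(expressed)
multi_medical = multi & medically_relevant

print(sorted(multi))
print(sorted(multi_medical))
```

Here GENE_B is medically relevant but expresses a single isoform, so it drops out of the final intersection.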

Research Conclusions and Significance

Deep long-read sequencing revealed the complexity of RNA isoforms in the human frontal cortex, particularly in disease-associated genes. This discovery is significant not only in basic scientific research but also in providing new ideas for clinical diagnostics and therapeutic development. Differential isoform analysis revealed disease-associated transcriptomic features that cannot be detected at the gene level, further demonstrating the importance of deep long-read sequencing in studying complex human diseases.

Research Highlights

  1. Discovery of Novel Isoforms: This study identified 1,534 novel isoforms from human frontal cortex samples for the first time.
  2. Protein-Level Validation: Some of the newly discovered isoforms were successfully validated at the protein level using mass spectrometry and other existing data.
  3. Differential Isoform Expression Analysis: Differential expression analysis at the isoform level revealed disease features not observable at the gene level.
  4. Application Prospects of Deep Long-Read Sequencing: The study shows that deep long-read sequencing has great potential in exploring complex disease mechanisms and developing new therapeutic methods.

Additional Information

The study provides a public web application for visualizing the expression of individual RNA isoforms in aged frontal cortex tissue (https://ebbertlab.com/brain_rna_isoform_seq.html), giving other researchers a practical tool for further analysis.