Genome-aware annotation of CRISPR guides validates targets in variant cell lines and enhances discovery in screens
Genomic Medicine and Re-annotation of CRISPR Guides: Application and Validation of the EXORCISE Algorithm
Academic Background
The advent of CRISPR-Cas9 technology has revolutionized genetic screening, particularly in studying gene essentiality and chemo-genetic interactions. By designing guide RNAs (gRNAs) that target specific genes, the CRISPR-Cas9 system introduces precise genetic knockouts in cells, aiding researchers in understanding gene functions and their roles in disease. However, CRISPR libraries are often designed based on reference genomes, while the cell lines used for research (especially cancer cell lines) frequently harbor genomic variations. This can lead to mismatches or biases in CRISPR guide sequences, affecting the accuracy of experimental results.
To address this issue, Simon Lam and colleagues developed an algorithm called EXORCISE (Exome-Guided Re-annotation of Nucleotide Sequences). This tool uses genome alignment and exon annotation to re-annotate CRISPR guide sequences, rectifying errors in CRISPR libraries and enhancing the discovery power of genetic screens.
Source of the Paper
The paper was authored by Simon Lam, John C. Thomas, and Stephen P. Jackson from the Cancer Research UK Cambridge Institute at the University of Cambridge. It was published in Genome Medicine in 2024 under the title “Genome-aware annotation of CRISPR guides validates targets in variant cell lines and enhances discovery in screens.”
Research Workflow and Findings
1. Development and Implementation of the EXORCISE Algorithm
The core concept of the EXORCISE algorithm is to re-annotate CRISPR guide sequences by aligning them with user-supplied genomes and combining exon annotations. The workflow includes:
- Guide Sequence Alignment: CRISPR guide sequences are aligned to the user-provided genome using BLAT (BLAST-like alignment tool) for precise matching.
- Determination of Cutting Sites: For each aligned guide sequence, the Cas9 cutting site is determined, usually located between the 3rd and 4th nucleotides upstream of the PAM (Protospacer Adjacent Motif) sequence.
- Exon Annotation: Cutting sites are cross-referenced with exon annotations. If the cutting site lies within an exon, the guide sequence is annotated to target the gene corresponding to that exon.
- Output of Re-annotations: The re-annotated CRISPR library is output, with a mapping between each guide's original and corrected annotation.
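The cut-site and exon lookup steps above can be sketched as follows. This is a minimal illustration, not the EXORCISE implementation. The `Hit` and `Exon` records, the 0-based half-open coordinates, and the rule that a cut counts only when it falls strictly inside an exon are assumptions made for the example. The alignment itself (done with BLAT in the paper) is taken as given.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical alignment record for one guide (0-based, half-open)."""
    chrom: str
    start: int
    end: int
    strand: str  # "+" or "-"


class Exon(NamedTuple):
    """Hypothetical exon record (0-based, half-open)."""
    chrom: str
    start: int
    end: int
    gene: str


def cut_site(hit: Hit) -> int:
    """Return the cut as a boundary coordinate c (between bases c-1 and c).

    Cas9 is taken to cut between the 3rd and 4th nucleotides upstream of
    the PAM. On the plus strand the PAM follows `end`; on the minus strand
    it precedes `start` in plus-strand coordinates.
    """
    if hit.strand == "+":
        return hit.end - 3
    return hit.start + 3


def genes_cut(hit: Hit, exons: list[Exon]) -> list[str]:
    """List the genes whose exons contain the cut site (assumed strict interior)."""
    c = cut_site(hit)
    return sorted(
        {e.gene for e in exons if e.chrom == hit.chrom and e.start < c < e.end}
    )


# Made-up coordinates and gene names, for illustration only.
exons = [
    Exon("chr1", 110, 130, "GENE_A"),
    Exon("chr1", 150, 203, "GENE_B"),
]
plus_hit = Hit("chr1", 100, 120, "+")   # cut boundary at 117, inside GENE_A
minus_hit = Hit("chr1", 200, 220, "-")  # cut boundary at 203, on GENE_B's edge

print(genes_cut(plus_hit, exons))   # ['GENE_A']
print(genes_cut(minus_hit, exons))  # []
```

The second guide overlaps the GENE_B exon but its cut falls on the exon edge, so under the strict-interior assumption it is not annotated to GENE_B. This is the kind of case where annotating by cut site rather than by sequence overlap changes the result.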
2. Evaluation of Commercial CRISPR Libraries
Using EXORCISE, the researchers re-annotated 55 commercially available CRISPR libraries and uncovered prevalent issues, including:
- Off-Target Effects: Some guides targeted exons in multiple genes. These accounted for 7.4% of guides when annotated with RefSeq exons, rising to 12.9% with the more permissive GENCODE annotations.
- Missed-Target Effects: Guides that failed to target any exon were classified as having missed-target effects, accounting for up to 16.1% with RefSeq annotations, decreasing to 9.6% with GENCODE annotations.
- False Non-Targeting Effects: Some guides had valid targets but lacked proper annotations, causing false non-targeting effects.
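The three error classes above can be expressed as a small labelling function. This is a sketch under assumptions: the function name, the use of `None` for a guide shipped as a non-targeting control, and the representation of re-annotation results as a set of gene names are illustrative choices, not the paper's data model.

```python
from typing import Optional


def classify(original_gene: Optional[str], genes_hit: set[str]) -> list[str]:
    """Label a guide by comparing its library annotation with re-annotation.

    original_gene: gene named by the library, or None for a guide shipped
                   as a non-targeting control (assumed convention).
    genes_hit:     genes whose exons the guide cuts after re-annotation.

    Returns an empty list when none of the three classes applies. Other
    discrepancies (for example, a guide that cuts one gene but a different
    one from its annotation) are not covered by this sketch.
    """
    labels = []
    if len(genes_hit) > 1:
        labels.append("off-target")
    if original_gene is not None and not genes_hit:
        labels.append("missed-target")
    if original_gene is None and genes_hit:
        labels.append("false non-targeting")
    return labels


# Hypothetical guides, for illustration only.
print(classify("GENE_A", {"GENE_A", "GENE_B"}))  # ['off-target']
print(classify("GENE_C", set()))                 # ['missed-target']
print(classify(None, {"GENE_D"}))                # ['false non-targeting']
print(classify("GENE_E", {"GENE_E"}))            # []
```

Returning a list rather than a single label lets one guide carry more than one class, such as a supposed control that cuts exons in two genes.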
3. Simulated CRISPR Screening Experiments
To assess the impact of common annotation errors on CRISPR screening results, the team constructed a synthetic genome and simulated CRISPR screening experiments. By incorporating different annotation errors (e.g., false non-targeting, missed-target, and boundary effects), the researchers observed:
- False Non-Targeting Effects: While reducing the number of discoveries, these errors preserved discovery precision.
- Missed-Target Effects: Guides that cut no exon behave like extra non-targeting guides within their annotated gene's guide set, which significantly weakened discovery strength.
- Boundary Effects: Errors in exon boundaries left the strongest signals largely intact but compromised the detection of medium-strength signals.
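A toy calculation shows why guides with no real target weaken gene-level signals. It is not the authors' synthetic-genome simulation. It assumes a gene score equal to the mean log fold change of its guides, a zero effect for every missed-target guide, and a hypothetical hit threshold of -0.8. All effect sizes are made up.

```python
THRESHOLD = -0.8  # hypothetical cut-off: scores at or below this count as hits


def gene_score(true_effect: float, n_guides: int, n_missed: int) -> float:
    """Mean log fold change when n_missed of n_guides cut no exon.

    Guides that hit the gene show `true_effect`; missed-target guides are
    assumed to show no effect (0.0), like non-targeting controls.
    """
    lfcs = [true_effect] * (n_guides - n_missed) + [0.0] * n_missed
    return sum(lfcs) / len(lfcs)


def is_hit(score: float) -> bool:
    return score <= THRESHOLD


# A strong and a medium-strength gene, four guides each, one of them missed.
for name, effect in [("strong", -3.0), ("medium", -1.0)]:
    clean = gene_score(effect, n_guides=4, n_missed=0)
    diluted = gene_score(effect, n_guides=4, n_missed=1)
    print(name, clean, is_hit(clean), diluted, is_hit(diluted))
# strong -3.0 True -2.25 True
# medium -1.0 True -0.75 False
```

In this toy setting the same single missed-target guide leaves the strong gene above the threshold but pushes the medium-strength gene below it, so removing or re-annotating such guides matters most for intermediate signals.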
4. Application to DepMap and DDRCS Datasets
The researchers applied EXORCISE to datasets from DepMap (the Cancer Dependency Map) and DDRCS (the DNA Damage Response CRISPR Screen portal). The re-annotated CRISPR libraries showed improved discovery strength for intermediate signals, particularly in cancer cell lines. EXORCISE also inferred exons from transcriptome data, which helped correct missed-target effects.
5. Design and Validation of New Libraries
The team designed “VBC Ideal Human” and “VBC Ideal Mouse” CRISPR libraries using EXORCISE. By removing guides with off-target or missed-target effects, they ensured a consistent number of guides per gene. These new libraries exhibited optimized targeting efficiency and discovery capabilities.
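The filtering idea behind such a library can be sketched as below. The paper's actual selection procedure is not described here, so this sketch makes several assumptions: guides arrive already ranked in priority order, each carries the set of genes it cuts after re-annotation, and genes that cannot supply the requested number of clean guides are reported rather than included. All names and data are hypothetical.

```python
from collections import defaultdict


def build_library(guides, per_gene):
    """Keep `per_gene` single-target guides for each gene.

    guides:   iterable of (guide_id, genes_hit) pairs in priority order,
              where genes_hit is the set of genes cut after re-annotation.
    per_gene: number of guides wanted for every gene.

    Returns (library, short): `library` maps each fully covered gene to
    its guide ids; `short` lists genes with too few clean guides.
    """
    by_gene = defaultdict(list)
    for guide_id, genes_hit in guides:
        if len(genes_hit) != 1:  # 0 genes = missed-target, >1 = off-target
            continue
        (gene,) = genes_hit
        if len(by_gene[gene]) < per_gene:
            by_gene[gene].append(guide_id)
    library = {g: ids for g, ids in by_gene.items() if len(ids) == per_gene}
    short = sorted(g for g, ids in by_gene.items() if len(ids) < per_gene)
    return library, short


# Hypothetical candidate guides, for illustration only.
candidates = [
    ("g1", {"GENE_A"}),
    ("g2", {"GENE_A", "GENE_B"}),  # off-target: dropped
    ("g3", set()),                 # missed-target: dropped
    ("g4", {"GENE_A"}),
    ("g5", {"GENE_B"}),
    ("g6", {"GENE_A"}),            # surplus: GENE_A already has two guides
]
library, short = build_library(candidates, per_gene=2)
print(library)  # {'GENE_A': ['g1', 'g4']}
print(short)    # ['GENE_B']
```

Reporting under-covered genes separately, instead of padding them with lower-quality guides, is one way to keep the guide count per gene uniform. Whether the VBC Ideal libraries handle such genes this way is not stated in the text.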
Conclusions and Implications
The EXORCISE algorithm provides an essential tool for improving the accuracy of CRISPR screening experiments, particularly when working with cell lines harboring genomic variations. By re-annotating CRISPR guide sequences, EXORCISE rectifies common annotation errors and enhances the discovery power of experiments.
Key Highlights
- Correction of Annotation Errors: EXORCISE effectively identifies and rectifies off-target, missed-target, and false non-targeting effects in CRISPR libraries.
- Enhanced Discovery of Intermediate Signals: Re-annotation improves the detection of medium-strength signals, uncovering many potential gene-drug interactions.
- Applicable to Diverse Cell Lines: EXORCISE supports custom genome and exon inputs, making it suitable for various cell lines, especially genomically unstable cancer cell lines.
Practical Applications
EXORCISE extends beyond CRISPR screening experiments and can be applied to other DNA sequence annotation tasks based on genome alignment. Its open-source nature (Creative Commons Zero 1.0 Universal License) ensures broad accessibility, advancing research in genomics and CRISPR technology.
By re-annotating guides against the genome actually being screened, researchers can improve the accuracy of CRISPR screening experiments and recover hits that annotation errors would otherwise obscure, supporting new discoveries in genomics and precision medicine.