SCICONE: Single-Cell Copy Number Calling and Event History Reconstruction
During tumor development, copy number alterations (CNAs) are key drivers of tumor heterogeneity and evolution. Understanding these variations is crucial for developing personalized cancer diagnostics and therapies. Single-cell sequencing technology offers the highest resolution for copy number analysis, down to the individual cell level. However, low read-depth whole-genome sequencing data poses significant statistical and computational challenges for detecting copy number variations. Most existing computational methods overlook the evolutionary relationships between cells, leading to less accurate detection results. Therefore, there is an urgent need to develop a method for copy number detection that incorporates the evolutionary history of cells.
Source of the Paper
This work was carried out jointly by researchers from ETH Zurich (Swiss Federal Institute of Technology) and the SIB Swiss Institute of Bioinformatics. The authors include Jack Kuipers, Mustafa Anıl Tuncel, Pedro F. Ferreira, Katharina Jahn, and Niko Beerenwinkel. The paper was published in 2025 in the journal Bioinformatics under the title “Single-cell copy number calling and event history reconstruction.”
Research Process
1. Research Objectives and Methodology Overview
The core objective of this study is to develop SCICONE, a statistical model with an accompanying Markov chain Monte Carlo (MCMC) algorithm that reconstructs the history of copy number events from low read-depth single-cell whole-genome sequencing data and infers the copy number profile of each cell. By modelling the evolutionary relationships between cells, SCICONE improves the accuracy of copy number detection.
2. Data Preprocessing and Segmentation
The study first preprocesses the single-cell sequencing data, correcting for GC content and mapping biases. It then uses dynamic programming to detect breakpoints along the genome, dividing it into segments within which the copy number is constant. The key to this step is combining signals from multiple cells to identify regions with significant copy number changes.
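The segmentation idea in this step can be sketched as a penalised optimal-partitioning dynamic program. This is a minimal illustration and not the SCICONE implementation: the function name `segment_bins`, the squared-error cost, and the `penalty` parameter are assumptions chosen for brevity, and the input is assumed to be bias-corrected already. What it shows is how evidence is pooled: the cost of a segment is summed over all cells, so a breakpoint shared by several cells is easier to detect than one seen in a single noisy cell.

```python
import numpy as np

def segment_bins(counts, penalty):
    """Optimal-partitioning DP over genomic bins, pooling evidence across cells.

    counts:  array of shape (n_cells, n_bins) with corrected read counts.
    penalty: cost charged per segment (hypothetical tuning parameter).
    Returns the sorted bin indices at which a new segment starts.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells, n_bins = counts.shape
    zeros = np.zeros((n_cells, 1))
    # Prefix sums give the squared error of any segment without rescanning it.
    s1 = np.concatenate([zeros, np.cumsum(counts, axis=1)], axis=1)
    s2 = np.concatenate([zeros, np.cumsum(counts ** 2, axis=1)], axis=1)

    def cost(i, j):
        # Squared deviation from each cell's own mean on bins [i, j), summed over cells.
        total = s1[:, j] - s1[:, i]
        squares = s2[:, j] - s2[:, i]
        return float(np.sum(squares - total ** 2 / (j - i)))

    best = np.full(n_bins + 1, np.inf)   # best[j]: minimal cost of segmenting bins [0, j)
    best[0] = 0.0
    prev = np.zeros(n_bins + 1, dtype=int)
    for j in range(1, n_bins + 1):
        for i in range(j):
            candidate = best[i] + cost(i, j) + penalty
            if candidate < best[j]:
                best[j], prev[j] = candidate, i

    # Walk back through the stored segment starts to recover the breakpoints.
    breakpoints = []
    j = n_bins
    while j > 0:
        i = int(prev[j])
        if i > 0:
            breakpoints.append(i)
        j = i
    return sorted(breakpoints)

# Toy input: two of three cells carry a gain on bins 4 to 7.
toy = np.full((3, 12), 2.0)
toy[:2, 4:8] = 3.0
print(segment_bins(toy, penalty=0.5))  # [4, 8]
```

A larger `penalty` yields fewer, longer segments; in practice it would have to be tuned to the noise level of the data.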
3. Construction of Copy Number Event Trees
SCICONE models the evolutionary history of a tumor as a copy number event tree (CNA tree). Each node of the tree represents copy number events (such as amplifications or deletions), while the tree’s topology reflects the order in which these events occurred and how they depend on one another. The study uses an MCMC algorithm to sample event trees and find the most likely tree structures and event combinations.
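The tree search can be pictured with a generic Metropolis-Hastings loop over tree topologies. The sketch below is an assumption-laden illustration rather than the sampler used in the paper: the tree is stored as a hypothetical parent list with node 0 as the root, the only move is prune-and-reattach, and `log_score` stands in for whatever posterior score the model defines. Because the moved subtree keeps its descendants, the proposal is symmetric and the acceptance rule reduces to a score difference.

```python
import math
import random

def descendants(parent, v):
    """Return v together with every node below it in the tree."""
    found = {v}
    changed = True
    while changed:
        changed = False
        for child, p in enumerate(parent):
            if p in found and child not in found:
                found.add(child)
                changed = True
    return found

def propose(parent, rng):
    """Prune a random non-root subtree and reattach it outside itself."""
    v = rng.randrange(1, len(parent))            # node 0 is the root and never moves
    blocked = descendants(parent, v)             # attaching here would create a cycle
    targets = [u for u in range(len(parent)) if u not in blocked]
    moved = list(parent)
    moved[v] = rng.choice(targets)
    return moved

def sample_trees(parent, log_score, n_steps, seed=0):
    """Metropolis-Hastings over topologies; returns the best tree visited and its score."""
    rng = random.Random(seed)
    current, current_score = list(parent), log_score(parent)
    best, best_score = list(current), current_score
    for _ in range(n_steps):
        candidate = propose(current, rng)
        candidate_score = log_score(candidate)
        delta = candidate_score - current_score
        # Symmetric proposal: accept with probability min(1, exp(delta)).
        if delta >= 0 or rng.random() < math.exp(delta):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return best, best_score
```

A real sampler would also need moves that change the events attached to each node, not only the topology; those are omitted here to keep the example short.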
4. Inference of Copy Number Profiles
Based on the event tree, SCICONE infers the copy number profile of each cell. Each cell is assigned to a node of the event tree, and its copy number state is determined by the events accumulated along the path from the root to that node. This process not only improves the accuracy of copy number detection but also reveals the clonal structure of the tumor.
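The path-based inference described here can be illustrated with a small sketch. Everything named in it is hypothetical: `EventNode`, `node_profile`, and `assign_cell` are illustrative helpers, the neutral state is assumed to be two copies, and the Poisson read-count model with a known `rate_per_copy` is a simplifying assumption, not the likelihood used by SCICONE.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional

@dataclass
class EventNode:
    name: str
    events: Dict[int, int]                 # segment index -> copy number change
    parent: Optional["EventNode"] = None

def node_profile(node: EventNode, n_segments: int, neutral: int = 2) -> List[int]:
    """Apply every event on the path from the node up to the root to a neutral genome."""
    profile = [neutral] * n_segments
    current = node
    while current is not None:
        for segment, change in current.events.items():
            profile[segment] += change
        current = current.parent
    return [max(0, c) for c in profile]    # copy numbers cannot drop below zero

def assign_cell(counts, nodes, n_segments, rate_per_copy):
    """Pick the node whose profile best explains a cell's segment read counts."""
    def log_lik(profile):
        total = 0.0
        for k, cn in zip(counts, profile):
            lam = max(rate_per_copy * cn, 1e-6)   # avoid log(0) for deleted segments
            total += k * math.log(lam) - lam      # Poisson log-likelihood up to a constant
        return total
    return max(nodes, key=lambda nd: log_lik(node_profile(nd, n_segments)))

# Toy tree: root -> A (gain on segments 0 and 1) -> B (loss on segments 1 and 3).
root = EventNode("root", {})
a = EventNode("A", {0: +1, 1: +1}, parent=root)
b = EventNode("B", {1: -2, 3: -1}, parent=a)
print(node_profile(b, 4))                                        # [3, 1, 2, 1]
print(assign_cell([30, 10, 20, 10], [root, a, b], 4, 10.0).name)  # B
```

Cells assigned to the same node share a profile, which is how the tree doubles as a description of the clonal structure.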
5. Validation with Simulated and Real Data
The study validates SCICONE’s performance on both simulated data and real tumor samples. The simulated data cover different read depths and segment numbers, showing that SCICONE excels under low read depth and high noise conditions. On real data, SCICONE successfully reconstructs the copy number evolutionary history of triple-negative breast cancer samples and detects copy number changes in key driver genes.
Main Results
1. Validation with Simulated Data
On simulated data, SCICONE performs well under low read depth (2x-8x) and high noise conditions, outperforming other methods (such as HMMcopy, Ginkgo, and SCOPE) in copy number detection accuracy. Its advantage is particularly evident for copy number events that span small segments.
2. Application to Real Data
In triple-negative breast cancer samples, SCICONE reconstructs the tumor’s clonal structure and detects copy number variations in key genes such as TP53, PIK3CA, and AKT1. Additionally, the study identifies a whole-genome duplication event, which is significant in tumor evolution.
3. Algorithm Performance Comparison
Compared to existing copy number detection methods, SCICONE shows improved accuracy and robustness. The gain is most pronounced on low read-depth data, where SCICONE separates signal from noise more effectively and provides more reliable copy number profiles.
Conclusions and Significance
SCICONE offers a novel method for single-cell copy number detection by integrating the evolutionary relationships of cells. Its core advantage lies in simultaneously inferring copy number profiles and evolutionary history, thereby more accurately revealing the clonal structure and evolutionary dynamics of tumors. This method not only holds significant scientific value but also provides new tools for personalized cancer treatment.
Research Highlights
- Integration of Evolutionary History: SCICONE is presented as the first method to combine copy number detection with the evolutionary history of cells, which improves detection accuracy.
- Dynamic Programming for Breakpoint Detection: The study develops a dynamic programming-based breakpoint detection method, effectively identifying regions of copy number changes.
- Optimization with MCMC Algorithm: Using the MCMC algorithm, SCICONE efficiently searches the complex tree structure space to find the most likely evolutionary models.
- Broad Application Prospects: SCICONE is not only suitable for low read-depth single-cell sequencing data but can also be applied to targeted sequencing and multi-omics data analysis, offering wide-ranging application potential.
Additional Valuable Information
The research team has provided an open-source implementation of SCICONE, available on GitHub (https://github.com/cbg-ethz/scicone). Additionally, the study offers detailed Snakemake workflows to facilitate the replication of experimental results by other researchers.
The methods introduced in this study are expected to support future advances in tumor evolution analysis, cancer diagnostics, and the optimization of treatment strategies.