Privacy-Preserving Framework for Genomic Computations via Multi-Key Homomorphic Encryption
Privacy-Preserving Framework for Genomic Analysis: A Study Based on Multi-Key Homomorphic Encryption
Academic Background
With the reduction in the cost of genome sequencing, the widespread availability of genomic data has opened up new possibilities for personalized medicine (also known as genomic medicine). However, genomic data contains a vast amount of sensitive information, such as disease susceptibility, ancestry, and physical traits, raising significant privacy concerns that act as barriers to medical research and data sharing. Although researchers have proposed various privacy-preserving techniques, existing cryptography-based methods still fall short in terms of interoperability, scalability, privacy protection, and support for multi-party analysis. These limitations hinder the potential of genomic data and its application in medical research. Therefore, there is an urgent need to develop encryption technologies that can protect privacy while supporting multi-party genomic data processing.
Source of the Paper
This paper was co-authored by Mina Namazi, Mohammadali Farahpoor, Erman Ayday, and Fernando Pérez-González, affiliated with the Open University of Catalonia, Case Western Reserve University, Universitat Politècnica de Catalunya, and the University of Vigo, respectively. The paper, titled "Privacy-Preserving Framework for Genomic Computations via Multi-Key Homomorphic Encryption", was published on January 31, 2025, in the journal Bioinformatics.
Research Process
1. Research Objectives and Methods
This study aims to overcome the limitations of existing cryptographic methods by leveraging Multi-Key Homomorphic Encryption (MKHE). MKHE allows computations to be performed on encrypted data from multiple data owners without the need for decryption, thereby enabling multi-party genomic analysis while preserving privacy. The research team developed a comprehensive protocol supporting various genomic analyses, including individual genomic testing, multi-party testing, genomic database analysis, and multi-database operations.
2. Multi-Key Homomorphic Encryption Technology
The research team employed an MKHE scheme based on the Ring Learning With Errors (RLWE) problem, proposed by Chen et al. (2019). Each data owner encrypts their data under their own public key, and computations are then performed directly on ciphertexts from multiple owners. The data remains encrypted throughout the computation, and the final result can only be recovered through joint decryption by the parties involved. This approach strengthens privacy protection and avoids a single point of failure.
3. System Model and Protocol Design
The proposed framework includes the following participants:
- Certified Institution (CI): Responsible for sequencing individuals’ biological samples.
- Key Authority (KA): Generates the system’s public parameters.
- Cloud Server (SPU): Acts as the storage and processing unit, performing analyses on encrypted data.
- Data Owners and Queriers: Individuals, hospitals, or other institutions that can conduct genomic analysis through the cloud server.
The research team designed the following key algorithms:
- MKHSetup: Generates public parameters.
- MKHKeyGen: Generates private, public, and evaluation keys for each participant.
- MKHEnc: Encrypts data using public keys.
- MKHPartDec: Each participant partially decrypts the encrypted result using their private key.
- MKHFinDec: Combines all partially decrypted results to obtain the final decryption.
- MKHEval: Performs computations on encrypted data.
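The data flow between these algorithms can be illustrated with a deliberately simplified toy. The sketch below is not the scheme of Chen et al. (2019) and is not secure: it works on single integers modulo a prime instead of ring polynomials, omits the noise terms that RLWE security depends on, supports only addition, and has no evaluation keys. The function names mirror the algorithm names above, but their signatures, the modulus `Q`, and the ciphertext layout are assumptions made for illustration only.

```python
import secrets

Q = (1 << 61) - 1  # toy prime modulus (assumption, not a real parameter set)

def mkh_setup():
    """Public parameter shared by every party (toy stand-in)."""
    return secrets.randbelow(Q - 1) + 1

def mkh_keygen(a):
    """Return (secret key, public key). No noise is added, so this is insecure."""
    s = secrets.randbelow(Q)
    return s, (-s * a) % Q

def mkh_enc(a, pk, owner, m):
    """Encrypt integer m under one owner's public key."""
    r = secrets.randbelow(Q)
    return {"c0": (r * pk + m) % Q, "parts": {owner: (r * a) % Q}}

def mkh_eval_add(ct1, ct2):
    """Add two ciphertexts, possibly encrypted under different owners' keys."""
    parts = dict(ct1["parts"])
    for owner, c in ct2["parts"].items():
        parts[owner] = (parts.get(owner, 0) + c) % Q
    return {"c0": (ct1["c0"] + ct2["c0"]) % Q, "parts": parts}

def mkh_part_dec(ct, owner, sk):
    """One owner's partial decryption share, computed with their secret key."""
    return (ct["parts"][owner] * sk) % Q

def mkh_fin_dec(ct, shares):
    """Combine the partial shares from every involved owner."""
    return (ct["c0"] + sum(shares)) % Q

# Two hypothetical data owners contribute one value each.
a = mkh_setup()
sk_alice, pk_alice = mkh_keygen(a)
sk_bob, pk_bob = mkh_keygen(a)

ct_sum = mkh_eval_add(mkh_enc(a, pk_alice, "alice", 3),
                      mkh_enc(a, pk_bob, "bob", 4))

shares = [mkh_part_dec(ct_sum, "alice", sk_alice),
          mkh_part_dec(ct_sum, "bob", sk_bob)]
total = mkh_fin_dec(ct_sum, shares)  # 7
```

The point of the toy is structural: the combined ciphertext carries one component per contributing owner, so the sum can only be recovered once every one of those owners has supplied a partial decryption share.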
4. Genomic Testing Scenarios
The research team demonstrated the framework’s application through four genomic testing scenarios:
- Individual Genomic Testing: For example, personalized medicine, calculating an individual’s genetic risk score for a specific disease.
- Multi-Party Testing: For example, paternity testing, comparing genetic markers between a child and an alleged father.
- Genomic Database Analysis: For example, similar patient search, identifying genetically similar individuals within a database.
- Multi-Database Operations: For example, record linkage, identifying and linking records belonging to the same individual across different databases.
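To make the underlying arithmetic concrete, the following plaintext sketch shows the kind of computation such scenarios typically reduce to. It does not reproduce the paper's formulas. The genotype encoding (0, 1, or 2 copies of a risk allele), the weighted-sum risk score, and the squared-distance comparison are common conventions assumed here for illustration, and the function names are hypothetical. Both functions use only additions and multiplications, which is the kind of operation homomorphic evaluation supports.

```python
def genetic_risk_score(genotypes, weights):
    """Weighted sum of allele counts (assumed form of a genetic risk score)."""
    return sum(w * g for g, w in zip(genotypes, weights))

def squared_distance(markers_a, markers_b):
    """Squared Euclidean distance between two marker vectors.

    A smaller value means more similar genotypes; this could serve as the
    comparison step in a marker comparison or similar-patient search.
    """
    return sum((x - y) ** 2 for x, y in zip(markers_a, markers_b))

def most_similar(query, database):
    """Return the index of the database record closest to the query."""
    return min(range(len(database)),
               key=lambda i: squared_distance(query, database[i]))

# Hypothetical toy data: three SNPs, allele counts in {0, 1, 2}.
score = genetic_risk_score([0, 1, 2], [0.5, 1.0, 0.25])    # 1.5
dist = squared_distance([0, 1, 2], [0, 2, 0])              # 5
best = most_similar([0, 1, 2], [[2, 2, 2], [0, 1, 1], [1, 0, 0]])  # 1
```

In an encrypted setting the final `min` would not be computed by the server in the clear; how the ranking step is handled is not described in this summary.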
Key Results
1. Privacy Protection and Security
The research team proved the security of the proposed framework under the semi-honest adversary model, in which parties follow the protocol but may try to learn more than they are entitled to. Because the framework relies on an MKHE scheme based on the RLWE problem, the data stays encrypted throughout the computation. Even if the cloud server or other participants attempt to extract additional information, they cannot decrypt the data.
2. Performance and Scalability
The research team evaluated the framework’s performance, showing that the runtime scales linearly with the size of the database. For individual genomic testing, the framework completes computations within 30 seconds; for multi-party testing, it takes 53 seconds; for database analysis and multi-database operations, the computation times are 17 seconds and 35 seconds, respectively. Although the framework is slightly less efficient than existing specialized solutions, its advantages in privacy protection and multi-party computation make it highly valuable in practical applications.
3. Interoperability and Flexibility
The framework allows data owners to encrypt their data using different public keys and perform multiple analyses on the encrypted data. This design enhances the system’s interoperability, enabling data owners to conduct various genomic tests without re-encrypting their data. Additionally, the framework supports the dynamic addition of new participants, further increasing its flexibility.
Conclusion and Significance
The MKHE-based privacy-preserving framework proposed in this study enables computations on encrypted data from multiple data owners, making multi-party genomic analysis possible while preserving privacy. Compared to existing methods, the framework offers notable advantages in privacy protection, interoperability, and flexibility, and it performs best in scenarios involving individual records or medium-scale databases.
Research Highlights
- Multi-Key Homomorphic Encryption Technology: The research team is the first to apply MKHE technology to genomic data processing, addressing the limitations of existing methods in privacy protection and multi-party computation.
- Comprehensive Genomic Analysis Protocol: The framework supports various genomic tests, including individual testing, multi-party testing, database analysis, and multi-database operations, showcasing its potential in diverse application scenarios.
- Performance and Scalability: The framework’s runtime scales linearly with the size of the database, making it suitable for medium-scale data analysis.
- Interoperability and Flexibility: The framework allows data owners to encrypt their data using different public keys and perform multiple analyses on the encrypted data, enhancing the system’s practicality and security.
Future Work
The research team plans to further optimize the framework’s performance, particularly for large-scale genomic data analysis. Additionally, the team will investigate how to address potential inference attacks to further strengthen the framework’s privacy protection capabilities.