Non-linear Multi Omics Data Integration Method Using Conditional Variational Autoencoders

Date
2025-01-27
Abstract

Advances in technology have enabled the study of diseases through multi-omics data, which combine information from the genome, epigenome, transcriptome, proteome, and metabolome. Unlike single-omics approaches, which provide limited insight, multi-omics integration offers a more comprehensive understanding of biological systems by capturing interactions across molecular layers. In recent years, several methods have been developed to integrate omics data. For example, Simidjievski et al. (2019) introduced techniques that use variational autoencoders (VAEs) for data integration. Similarly, Zarayeneh et al. (2017) proposed the Integrative Gene Regulatory Network (iGRN), which combines multiple layers of omics data using a network made up entirely of gene nodes. This thesis develops data integration architectures based on conditional variational autoencoders (CVAEs). The key advantage of this approach is that it allows class label information to be incorporated during data integration. To the best of our knowledge, CVAEs have not been applied in previous multi-omics research. The thesis also introduces new methods for integrating more than two datasets using CVAEs, which, to our knowledge, no prior multi-omics study has explored. The proposed architectures were tested on both real and simulated datasets. In both settings, adding an outcome variable (class labels) to a regular VAE improved predictive performance. In addition, integrating multiple datasets produced better results than predicting from a single dataset or using VAEs without labels.
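
To illustrate how class labels can enter a CVAE during integration, the following is a minimal NumPy sketch of a single forward pass. It is not the architecture developed in the thesis, which the abstract does not specify. It assumes two omics blocks joined by simple concatenation, a one-hot label fed to both encoder and decoder, a standard normal prior, and a squared-error reconstruction term. All names (`cvae_forward`, `cvae_loss`, `init_linear`), layer sizes, and the random toy data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_linear(n_in, n_out):
    # Small random weights and zero biases for one dense layer (illustrative only).
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def cvae_forward(x1, x2, y_onehot, params):
    # Encoder q(z | x1, x2, y): both omics blocks and the label are concatenated.
    enc_in = np.concatenate([x1, x2, y_onehot], axis=1)
    W, b = params["enc_hidden"]
    h = np.tanh(enc_in @ W + b)
    W, b = params["enc_mu"]
    mu = h @ W + b
    W, b = params["enc_logvar"]
    logvar = h @ W + b
    # Reparameterisation trick: z = mu + sigma * eps, with eps ~ N(0, I).
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)
    # Decoder p(x1, x2 | z, y): the label is concatenated to the latent code.
    dec_in = np.concatenate([z, y_onehot], axis=1)
    W, b = params["dec_hidden"]
    hd = np.tanh(dec_in @ W + b)
    W, b = params["dec_out"]
    recon = hd @ W + b
    return recon, mu, logvar

def cvae_loss(x1, x2, recon, mu, logvar):
    # Squared-error reconstruction plus closed-form KL(q(z|x,y) || N(0, I)).
    target = np.concatenate([x1, x2], axis=1)
    recon_term = np.mean(np.sum((recon - target) ** 2, axis=1))
    kl_term = np.mean(-0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1))
    return recon_term, kl_term

# Toy data: 8 samples, two hypothetical omics blocks, 2 classes.
n, d1, d2, n_classes, hidden, latent = 8, 5, 4, 2, 6, 3
x1 = rng.standard_normal((n, d1))
x2 = rng.standard_normal((n, d2))
y_onehot = np.eye(n_classes)[rng.integers(0, n_classes, size=n)]

params = {
    "enc_hidden": init_linear(d1 + d2 + n_classes, hidden),
    "enc_mu": init_linear(hidden, latent),
    "enc_logvar": init_linear(hidden, latent),
    "dec_hidden": init_linear(latent + n_classes, hidden),
    "dec_out": init_linear(hidden, d1 + d2),
}

recon, mu, logvar = cvae_forward(x1, x2, y_onehot, params)
recon_term, kl_term = cvae_loss(x1, x2, recon, mu, logvar)
loss = recon_term + kl_term
```

In this sketch, dropping `y_onehot` from both concatenations would reduce the model to an unconditional VAE, which is the comparison the abstract describes. Extending it to more than two datasets would, under the same concatenation assumption, amount to adding further blocks to the encoder input and the reconstruction target.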

Keywords
Conditional Variational Autoencoder, Multi-Omics Data
Citation
Gustinna Wadu, D. (2025). Non-linear multi omics data integration method using conditional variational autoencoders (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.