Utilizing statistical methods to discover genetic variants underlying disease traits using multi-omics data

Kahanda Liyanage, Rushani Nilakshika Kumari Perera

dc.contributor.advisor	Zhang, Qingrun
dc.contributor.author	Kahanda Liyanage, Rushani Nilakshika Kumari Perera
dc.contributor.committeemember	Ji, Yunqi Jacob
dc.contributor.committeemember	Wang, Haixu
dc.date	2024-11
dc.date.accessioned	2024-08-28T18:47:18Z
dc.date.available	2024-08-28T18:47:18Z
dc.date.issued	2024-08-27
dc.description.abstract	Identifying genetic variants statistically associated with specific diseases is the focus of Genome-Wide Association Studies (GWAS). Advancements in omics technologies have enabled the use of multi-omics data to bridge the gap between genotypes and their resulting phenotypes. Recently, various models have been proposed that use omics data to estimate polygenic terms. For example, the Image-Mediated Association Study (IMAS) leverages brain imaging data to conduct association mapping in legacy GWAS cohorts. Meanwhile, the Expression-Directed Linear Mixed Model (EDLMM) incorporates expression data to identify low-effect genetic variants, demonstrating superior performance in terms of power and real data analysis outcomes. However, most current association studies focus on a single biological unit. In our work, we developed an Image Expression Directed Linear Mixed Model (IEDLMM), which estimates the polygenic term in a linear mixed model using informative weights learned from training genetically predictive models: a linear mixed model for brain images and a Bayesian Sparse Linear Mixed Model for gene expressions. Simulations show that IEDLMM exhibits higher power than current methods while keeping type-I error rates under control. By leveraging the UK Biobank image-derived phenotypes (IDPs) and GTEx gene expression data, IEDLMM identified 15 unique genes related to brain disorders across four datasets. These genes are validated through DisGeNET functional annotations, demonstrating the efficacy of IEDLMM compared to existing methods. The creation of IEDLMM paves the way for further exploration of integrating multiple omics data within a single framework. This method not only improves the credibility of the results but also furthers our knowledge in the field, laying a foundation for future research efforts.
dc.identifier.citation	Kahanda Liyanage, R. N. K. P. (2024). Utilizing statistical methods to discover genetic variants underlying disease traits using multi-omics data (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
dc.identifier.uri	https://hdl.handle.net/1880/119531
dc.language.iso	en
dc.publisher.faculty	Graduate Studies
dc.publisher.institution	University of Calgary
dc.rights	University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
dc.subject	Linear Mixed Model
dc.subject	Bayesian Sparse Linear Mixed Model
dc.subject	Multi omics
dc.subject.classification	Statistics
dc.title	Utilizing statistical methods to discover genetic variants underlying disease traits using multi-omics data
dc.type	master thesis
thesis.degree.discipline	Mathematics & Statistics
thesis.degree.grantor	University of Calgary
thesis.degree.name	Master of Science (MSc)
ucalgary.thesis.accesssetbystudent	I do not require a thesis withhold – my thesis will have open access and can be viewed and downloaded publicly as soon as possible.