Utilizing statistical methods to discover genetic variants underlying disease traits using multi-omics data
dc.contributor.advisor | Zhang, Qingrun | |
dc.contributor.author | Kahanda Liyanage, Rushani Nilakshika Kumari Perera | |
dc.contributor.committeemember | Ji, Yunqi Jacob | |
dc.contributor.committeemember | Wang, Haixu | |
dc.date | 2024-11 | |
dc.date.accessioned | 2024-08-28T18:47:18Z | |
dc.date.available | 2024-08-28T18:47:18Z | |
dc.date.issued | 2024-08-27 | |
dc.description.abstract | Identifying genetic variants statistically associated with specific diseases is the focus of Genome-Wide Association Studies (GWAS). Advancements in omics technologies have enabled the use of multi-omics data to bridge the gap between genotypes and their resulting phenotypes. Recently, various models have been proposed that use omics data to estimate polygenic terms. For example, the Image-Mediated Association Study (IMAS) leverages brain imaging data to conduct association mapping in legacy GWAS cohorts. Meanwhile, the Expression-Directed Linear Mixed Model (EDLMM) incorporates expression data to identify low-effect genetic variants, demonstrating superior performance in terms of power and real data analysis outcomes. However, most current association studies focus on a single biological unit. In our work, we developed an Image Expression Directed Linear Mixed Model (IEDLMM), which estimates the polygenic term in a linear mixed model using informative weights learned from training genetically predictive models: a linear mixed model for brain images and a Bayesian Sparse Linear Mixed Model for gene expression. Through simulations, we show that IEDLMM exhibits higher power than current methods while keeping type-I error rates under control. By leveraging the UK Biobank image-derived phenotypes (IDPs) and GTEx gene expression data, IEDLMM identified 15 unique genes related to brain disorders across four datasets, which are validated through DisGeNET functional annotations, supporting the efficacy of IEDLMM compared to existing methods. The creation of IEDLMM paves the way for further exploration of integrating multiple omics data within a single framework. This method not only improves the credibility of the results but also furthers our knowledge in the field, laying a foundation for future research efforts. | |
dc.identifier.citation | Kahanda Liyanage, R. N. K. P. (2024). Utilizing statistical methods to discover genetic variants underlying disease traits using multi-omics data (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. | |
dc.identifier.uri | https://hdl.handle.net/1880/119531 | |
dc.language.iso | en | |
dc.publisher.faculty | Graduate Studies | |
dc.publisher.institution | University of Calgary | |
dc.rights | University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission. | |
dc.subject | Linear Mixed Model | |
dc.subject | Bayesian Sparse Linear Mixed Model | |
dc.subject | Multi omics | |
dc.subject.classification | Statistics | |
dc.title | Utilizing statistical methods to discover genetic variants underlying disease traits using multi-omics data | |
dc.type | master thesis | |
thesis.degree.discipline | Mathematics & Statistics | |
thesis.degree.grantor | University of Calgary | |
thesis.degree.name | Master of Science (MSc) | |
ucalgary.thesis.accesssetbystudent | I do not require a thesis withhold – my thesis will have open access and can be viewed and downloaded publicly as soon as possible. |