Utilizing statistical methods to discover genetic variants underlying disease traits using multi-omics data

dc.contributor.advisor: Zhang, Qingrun
dc.contributor.author: Kahanda Liyanage, Rushani Nilakshika Kumari Perera
dc.contributor.committeemember: Ji, Yunqi Jacob
dc.contributor.committeemember: Wang, Haixu
dc.date: 2024-11
dc.date.accessioned: 2024-08-28T18:47:18Z
dc.date.available: 2024-08-28T18:47:18Z
dc.date.issued: 2024-08-27
dc.description.abstract: Identifying genetic variants statistically associated with specific diseases is the focus of Genome-Wide Association Studies (GWAS). Advancements in omics technologies have enabled the use of multi-omics data to bridge the gap between genotypes and their resulting phenotypes. Recently, various models have been proposed to utilize omics data for estimating polygenic terms. For example, the Image-Mediated Association Study (IMAS) leverages brain imaging data to conduct association mapping in legacy GWAS cohorts. Meanwhile, the Expression-Directed Linear Mixed Model (EDLMM) incorporates expression data to identify low-effect genetic variants, demonstrating superior performance in terms of power and real data analysis outcomes. However, most current association studies focus on a single biological unit. In our work, we developed an Image Expression Directed Linear Mixed Model (IEDLMM), which estimates the polygenic term in a linear mixed model using informative weights learned from training genetically predictive models: a linear mixed model for brain images and a Bayesian Sparse Linear Mixed Model for gene expressions. Through simulations, we show that IEDLMM exhibits higher power than current methods while keeping the type-I error rates under control. By leveraging the UK Biobank image-derived phenotypes (IDPs) and GTEx gene expression data, IEDLMM identified 15 unique genes related to brain disorders across four datasets. These genes were validated through DisGeNET functional annotations, supporting the efficacy of IEDLMM compared to existing methods. The creation of IEDLMM paves the way for additional exploration in the integration of multiple omics data within a single framework. This method not only improves the credibility of the results but also furthers our knowledge in the field, laying a foundation for future research efforts.
dc.identifier.citation: Kahanda Liyanage, R. N. K. P. (2024). Utilizing statistical methods to discover genetic variants underlying disease traits using multi-omics data (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
dc.identifier.uri: https://hdl.handle.net/1880/119531
dc.language.iso: en
dc.publisher.faculty: Graduate Studies
dc.publisher.institution: University of Calgary
dc.rights: University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
dc.subject: Linear Mixed Model
dc.subject: Bayesian Sparse Linear Mixed Model
dc.subject: Multi omics
dc.subject.classification: Statistics
dc.title: Utilizing statistical methods to discover genetic variants underlying disease traits using multi-omics data
dc.type: master thesis
thesis.degree.discipline: Mathematics & Statistics
thesis.degree.grantor: University of Calgary
thesis.degree.name: Master of Science (MSc)
ucalgary.thesis.accesssetbystudent: I do not require a thesis withhold – my thesis will have open access and can be viewed and downloaded publicly as soon as possible.
Files
Original bundle
Name: ucalgary_2024_kahanda-liyanage_rushani.pdf
Size: 3.45 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.62 KB
Format: Item-specific license agreed upon to submission
Description: