Characterizing genetic basis of complex diseases by integrating data-bridge and genomics
dc.contributor.advisor | Long, Quan | |
dc.contributor.author | He, Jingni | |
dc.contributor.committeemember | de Koning, Jason | |
dc.contributor.committeemember | Tekougang, Thierry Chekouo | |
dc.date | 2023-09 | |
dc.date.accessioned | 2023-07-06T20:19:24Z | |
dc.date.available | 2023-07-06T20:19:24Z | |
dc.date.issued | 2023-06 | |
dc.description.abstract | With the advancement of high-throughput sequencing and genotyping technology, large amounts of multi-omics data are being generated in genomic projects. Such multi-omics data lie between genotype and phenotype and may therefore serve as data-bridges to help statistical genetic analyses. How to integrate such data-bridges effectively brings both challenges and opportunities for statistical geneticists. The challenges include statistical overfitting, seamlessly integrating biological priors with high-dimensional data, and interpreting statistical results in the context of biology. The work in this thesis focuses on integrating such data-bridges to characterize the genetic basis of complex diseases and on addressing the aforementioned challenges. I have developed novel statistical models for analyzing multi-omics data from four perspectives: (Q1) how to integrate biological priors such as transcription factors with statistical models; (Q2) how to utilize trans-regulatory variants while keeping the model robust despite the large number of possible candidates; (Q3) how to utilize data-bridges to improve the modeling of rare genetic variants; and (Q4) how to utilize brain imaging data in genetic association mapping. These efforts led to four novel statistical models and their implementations: (M1) sTF-TWAS, which integrates prior knowledge of transcription factors (TF) with association studies; (M2) transTF-TWAS, which uses Group Lasso to incorporate TF-linked trans-located variants; (M3) rvTWAS, which leverages transcriptome-directed feature selection for rare variants; and (M4) IMAS, which uses borrowed brain images to conduct image-directed feature selection and aggregation. All four methods are verified by comprehensive simulations based on known genetic architectures and heritability models. 
Utilizing the large-scale omics data accessed through dbGaP and UK Biobank, as well as the large cohorts from our collaborator, I have applied these methods to cancers and neuropsychiatric disorders, yielding the discovery of additional genes underlying complex traits. I have also thoroughly validated the methods by examining the discoveries against existing biological literature and databases. The development of these methods opens a door for integrating data-bridges such as transcriptomes and imaging data in genetic mapping. The novel findings provide additional insights into the genetic basis of cancers and brain disorders. | |
dc.identifier.citation | He, J. (2023). Characterizing genetic basis of complex diseases by integrating data-bridge and genomics (Doctoral thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. | |
dc.identifier.uri | https://hdl.handle.net/1880/116710 | |
dc.identifier.uri | https://dx.doi.org/10.11575/PRISM/41552 | |
dc.language.iso | en | |
dc.publisher.faculty | Graduate Studies | |
dc.publisher.institution | University of Calgary | |
dc.rights | University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission. | |
dc.subject.classification | Education--Health | |
dc.subject.classification | Bioinformatics | |
dc.subject.classification | Genetics | |
dc.title | Characterizing genetic basis of complex diseases by integrating data-bridge and genomics | |
dc.type | doctoral thesis | |
thesis.degree.discipline | Medicine – Biochemistry and Molecular Biology | |
thesis.degree.grantor | University of Calgary | |
thesis.degree.name | Doctor of Philosophy (PhD) | |
ucalgary.thesis.accesssetbystudent | I require a thesis withhold – I need to delay the release of my thesis due to a patent application, and other reasons outlined in the link above. I have/will need to submit a thesis withhold application. |