Quan, Hude; Lee, Joon; Lee, Seungwon
2023-04-04
2022-09-06
Lee, S. (2022). Applications of Data Science to Electronic Health Data in Health Services Research (Doctoral thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
https://prism.ucalgary.ca/handle/1880/116032
https://dx.doi.org/10.11575/PRISM/dspace/40878

The application of data science to medical big data is essential for achieving precision medicine and building a learning health system. Many electronic health databases contain big data in medicine. Broadly, these databases are divided into administrative data, electronic medical records (EMR) data, and other types such as clinical registries. They were designed for different purposes and have informed the health system and its stakeholders. Bringing these datasets together is an essential step for data-driven research. This manuscript-based thesis focuses on applying data science to electronic health data. The first part explores the use of Allscripts Sunrise Clinical Manager (SCM) EMR data for research, including its advantages and challenges, and then establishes a process for linking this database with other databases to build a disease cohort. The second part presents a systematic scoping review of how data science has been applied to similarly linked data to define conditions and comorbidities. Capturing comorbidities and outcomes is fundamental for studying treatment effects and tailoring medical decisions. The third and last part narrows the focus to non-alcoholic fatty liver disease and applies data science methodologies to answer health services research questions specific to that disease. The completion of this work demonstrates the successful application of data science to electronic health data for health services research.
Specifically, the first part paves the way for routinely using SCM EMR data for research in Alberta. Organizational procedures for data storage and transfer are also mapped out. These activities may not be of direct scientific value, but they are crucial for building infrastructure capable of supporting scientific work. The second part describes how data science is currently applied to identify comorbidities and outcomes, and sheds light on potential directions for ongoing and future research. The third part combines data analytics with existing health services research methods (i.e., epidemiology) and demonstrates that data tools can be developed to reduce the burden on care providers and the health system. Multidisciplinary collaboration and input from diverse perspectives are vital for achieving precision medicine.

English
Electronic medical records
Health services
Data science
Health Sciences--General
Applications of Data Science to Electronic Health Data in Health Services Research
doctoral thesis