Josephson, Colin Bruce; Sajobi, Tolulope; Lin, Chantelle Qing Yang
2024-01-03
2024-01-03
2024-01-02
Lin, C. Q. Y. (2023). Predicting the side effects of antiseizure medications using machine learning models (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
https://hdl.handle.net/1880/117843
https://doi.org/10.11575/PRISM/42686
With over 20 antiseizure medications (ASMs) available, identifying the ideal drug is often imprecise and time-consuming. Developing predictive models to expedite optimal drug selection is challenging because efficacy differs only minimally among adult patients with epilepsy. However, side-effects vary considerably between medications and are one of the main reasons for discontinuation of ASM treatment. The aims were to (1) assess the prognostic utility of high-dimensional data, such as genetic features, combined with clinical features to predict ASM discontinuation, and (2) determine the optimal regression/machine learning model for predicting ASM discontinuation. This retrospective cohort study included 4,853 exposures to any ASM, and 624 patients exposed to valproic acid (VPA), from the RAISE-GENIC study during the years 2006-2020. The predicted outcome was defined as ASM discontinuation due to any side-effect reported by the patient. Clinical features included age of onset, patient age, sex, comorbidities, seizure type, EEG variables, and imaging variables. Network analysis of mRNA expression data from VPA-exposed neurons derived from control induced pluripotent stem cells (iPSCs) was leveraged to extract exome sequencing and genome-wide single nucleotide polymorphism data. Features were selected for model inclusion based on relevance as determined by the ReliefF algorithm. Penalized logistic regression, support vector machine, random forest, and k-nearest neighbor models were trained on the normalized bootstrapped dataset, and model quality was assessed using stratified 10-fold cross-validation.
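The evaluation scheme named in the abstract (stratified k-fold splitting, then accuracy, sensitivity, specificity, and Brier score) can be illustrated with a minimal standard-library sketch. This is not the thesis's code: the function names `stratified_k_fold` and `binary_metrics`, the 0.5 decision threshold, and the toy labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test folds roughly
    preserve the class proportions of `labels` (hypothetical helper)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Deal each class's shuffled indices round-robin across the k folds.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute accuracy, sensitivity, specificity, and Brier score for
    binary outcomes (1 = discontinuation). The threshold is an assumption."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    def ratio(num, den):
        # Undefined ratios (empty class) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(y_true)),
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        # Brier score: mean squared error of the predicted probabilities.
        "brier": ratio(sum((p - t) ** 2 for t, p in zip(y_true, y_prob)), len(y_true)),
    }


# Toy data only: 20 positive and 80 negative outcomes, not study data.
toy_labels = [1] * 20 + [0] * 80
toy_folds = list(stratified_k_fold(toy_labels, k=10))
```

In practice a model would be fitted on each `train` index set and scored on the matching `test` set, with the per-fold metrics then summarized; a library implementation of stratified splitting would normally be preferred over this hand-rolled version.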
Models with only clinical features and models with combined clinical and genetic features were compared using quantitative as well as visual discrimination and calibration metrics. The best performing model was the penalized logistic regression using the VPA dataset with genetic and clinical features. Its accuracy was 0.75 [95% confidence interval 0.74-0.76], area under the receiver operating characteristic curve was 0.66 [0.66-0.67], Brier score was 0.20 [0.19-0.21], sensitivity was 0.42 [0.41-0.42], and specificity was 0.82 [0.82-0.83]. Machine learning using clinical and genetic features can predict treatment-ending side-effects of VPA with moderate performance, discrimination, and calibration. If these results can be validated and improved upon, decision tools can be incorporated into clinical routines, simplifying drug prescriptions, saving time, and improving patient quality of life.
en
University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
Biostatistics; Education--Health; Artificial Intelligence; Epidemiology
Predicting the Side Effects of Antiseizure Medications Using Machine Learning Models
master thesis