Browsing by Author "Williamson, Tyler S."
Now showing 1 - 12 of 12
Item Open Access Classification Models for Multivariate Non-normal Repeated Measures Data (2021-01-08) Brobbey, Anita; Sajobi, Tolulope T.; Wiebe, Samuel; Williamson, Tyler S.; Nettel-Aguirre, Alberto
Multivariate repeated measures data, in which multiple outcomes are repeatedly measured at two or more occasions, are commonly collected in several disciplines (e.g., medicine, ecology, environmental sciences), where investigators seek to discriminate between population groups or make predictions based on changes in multiple correlated outcomes over time. Repeated measures discriminant analysis models have been developed and applied to address these research questions. These classification models, which have been mostly developed based on growth curve models, covariance pattern models, and mixed-effects models, are advantageous in that they can account for complex correlation structures in multivariate repeated measures data (e.g., within-outcome and between-outcome correlations) to improve their predictive accuracy. However, they largely rely on the assumption of multivariate normality, which is rarely satisfied in multivariate repeated measures data. To our knowledge, there has been limited investigation of the behavior of these existing models in multivariate non-normal repeated measures data. The overarching goal of this research was to develop robust repeated measures discriminant analysis classifiers for multivariate non-normal repeated measures data. Specifically, we developed repeated measures discriminant analysis classifiers based on maximum trimmed likelihood estimators (MTLE) and generalized estimating equations (GEE) estimators and examined their accuracy in comparison to classifiers based on maximum likelihood estimation (MLE) using Monte Carlo methods.
The simulation conditions examined included population distribution, sample size, covariance structure (between-outcomes and within-outcome), covariance heterogeneity, number of repeated occasions, and number of outcome variables. The Monte Carlo study results indicated that the proposed methods increased overall mean classification accuracy by 2% - 15% in multivariate non-normal repeated measures data compared to repeated measures discriminant analysis based on MLE under most scenarios. Data from two cohort studies were used to illustrate the implementation of the proposed repeated measures discriminant analysis methods. The outcomes of this research include novel multivariate classifiers for predicting group membership in multivariate normal and non-normal repeated measures data. This research contributes to the advancement of statistical science on methods for analyzing multivariate repeated measures data.

Item Open Access Drip and Ship versus Mothership: Transportation and Treatment Strategies for Acute Ischemic Stroke Patients (2019-07-22) Holodinsky, Jessalyn Kathryn; Hill, Michael D.; Williamson, Tyler S.; Demchuk, Andrew M.; Patel, Alka B.
Ischemic stroke with large vessel occlusion can be treated with alteplase and/or endovascular therapy. While endovascular therapy has been proven more effective than alteplase, the administration of both treatments is highly time sensitive. There are geographic disparities in access to endovascular therapy. For patients outside the immediate vicinity of a hospital equipped to perform endovascular therapy, it is unknown whether transport directly to an endovascular center for alteplase and endovascular therapy (mothership) or transport to the closest centre for immediate alteplase treatment followed by transfer to the endovascular center (drip-and-ship) will result in the best patient outcomes. In this thesis, this question is explored using theoretical conditional probability.
Models were generated using existing data from clinical trials of stroke treatment, the accuracy of prehospital large vessel occlusion screening tools, and time from onset to stroke treatment (a function of both geography and hospital efficiency). The models were used to determine which strategy predicts the greatest probability of excellent outcome for stroke patients in several different scenarios. The optimal transport strategy is influenced by three different factors, the impact of which is summarized as follows from the perspective of the drip and ship approach. First, the most probable diagnosis of the patient. As the positive predictive value of the large vessel occlusion screening tool decreases, the importance of the drip and ship model is appreciated. Second, the speed of treatment at the receiving hospitals. Fast treatment at thrombolysis centres is key for the drip and ship model to remain viable. Finally, the patient’s travel time to and between the different hospitals. As the distance between the thrombolysis and endovascular centre increases, the importance of the drip and ship model is again realized. This thesis presents a novel way of conceptualizing the pre-hospital transport of suspected stroke patients. Decision-making for pre-hospital transport can be modelled using existing clinical trial data; these models can be dynamically adapted to changing realities. As the radius of superiority of the different transport strategies is context specific, regional customization of transport protocols for stroke patients is essential.

Item Open Access Enhancing Primary Care Electronic Medical Record (EMR) Data in Alberta by Quality Assessment, Data Processing, and Linkage to Administrative Data (2020-07-01) Garies, Stephanie; Quan, Hude; Williamson, Tyler S.; Drummond, Neil A.; McBrien, Kerry Alison
The growth of electronic medical record (EMR) systems in healthcare settings has created opportunities for EMR data to be reused for secondary purposes.
Since EMR data are generated from clinical and administrative processes, the suitability for other uses (e.g. surveillance or research) is questionable. Assessing data quality is important for understanding the database contents, identifying potential limitations or biases, and determining how ‘fit for purpose’ the data are. This thesis focused on evaluating and improving the quality of primary care EMR data in Alberta. Data quality, which is highly contextual, was examined from the perspective of use for hypertension surveillance, as hypertension is a prevalent chronic condition associated with poor health outcomes and high cost implications. The first part of this thesis involved developing a comprehensive description of EMR data capture, extraction, and processing by the Canadian Primary Care Sentinel Surveillance Network (CPCSSN) in Alberta. The second section presented a data quality assessment using CPCSSN data elements relevant to hypertension surveillance. The third part explored multiple imputation and a pattern-matching algorithm for improving smoking status records in the EMR data. Lastly, EMR and administrative data for a cohort of hypertensive patients were linked and described. The CPCSSN process documentation and data quality assessment created novel, useful, and comprehensive information for data users. CPCSSN data appear to be suitable for hypertension surveillance, though caution is warranted for several variables of inconsistent quality. Multiple imputation improved completeness of patient smoking statuses, but the lack of an appropriate external reference source made confirming accuracy difficult. The pattern-matching algorithm demonstrated high accuracy for categorizing smoking status; however, it missed classifying 24% of patients. Lastly, EMR data for 6,307 hypertensive patients were successfully linked to five administrative databases. 
Although this linked sample is relatively small and may be subject to selection bias (limiting the generalizability for surveillance purposes), the cohort could be useful for health outcomes research or validating elements in the EMR or administrative databases. This work has informed the development of more efficient processes for EMR-administrative linkages. Data quality assessment outcomes will be made available to inform various types of CPCSSN data users.

Item Open Access Examining Neighbourhood Socioeconomic Status, Anxiety and Depression during Pregnancy, and Preterm Birth (2019-07-10) Adhikari Dahal, Kamala; Metcalfe, Amy; Patten, Scott B.; Williamson, Tyler S.; Patel, Alka B.
Background: The influence of anxiety, depression, and neighbourhood socioeconomic status (SES) on the risk of preterm birth (PTB) is unclear. This doctoral research examined the ability of neighbourhood SES to predict the risk of PTB, the utility of existing anxiety scales in measuring anxiety in pregnancy, and whether neighbourhood SES modified the association between anxiety and depression during pregnancy and PTB. Methods: This study used data from two pregnancy cohort studies in Alberta, Canada (n=5,528). The data were linked to neighbourhood SES data, derived from the Canadian census. A multilevel logistic regression prediction model was developed to examine whether neighbourhood SES improves the prediction of PTB. Confirmatory factor analysis and Spearman correlation were used to examine the utility of anxiety scales in pregnancy. A multivariable logistic regression model was used to assess whether neighbourhood SES modifies the association between anxiety and/or depression and PTB. Results: Neighbourhood-level variance accounted for 6% of the variation in PTB. Neighbourhood SES combined with maternal characteristics predicted PTB with an area under the receiver operating characteristic curve (AUC) of 0.75. Maternal characteristics alone had an AUC of 0.60.
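As a reading aid for the AUC values quoted in this abstract, the following is a minimal Python sketch of how an AUC can be computed: it is the proportion of (case, non-case) pairs in which the case receives the higher predicted risk, with ties counted as one half. The function name `auc` and the toy labels and scores are assumptions made purely for illustration; they are not data or code from the thesis.

```python
def auc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve.

    labels: 1 for a case (e.g. preterm birth), 0 for a non-case.
    scores: predicted risks, higher meaning more likely to be a case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    # Count pairs where the case outranks the non-case; ties count as 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (not from the study).
toy_labels = [1, 0, 1, 0, 0, 1]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.6, 0.7]
toy_auc = auc(toy_labels, toy_scores)
```

On this scale 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the frame in which the reported 0.60 and 0.75 can be read.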
The model fit of anxiety scales ranged from inadequate to adequate. The correlation between the scales was low to moderate. The presence of both anxiety and depression, but neither anxiety nor depression alone, was significantly associated with PTB (OR=1.57, 95% CI=1.07, 2.29) and had significant interaction with neighbourhood deprivation (p-value=0.014). Conclusions: This research may suggest that women’s neighbourhood SES improves overall prediction of PTB and that it modifies the effects of anxiety and depression on risk of PTB. It may also indicate that existing anxiety scales do not measure anxiety as a single dimension and that they are not comparable with one another. These findings may guide the identification of women at increased risk for PTB and future research in the field.

Item Open Access Hospitalization Costs of Canadian Cystic Fibrosis Patients (2020-09-29) Skolnik, Kate; Williamson, Tyler S.; Quon, Bradley S.; Pendharkar, Sachin R.; Ronksley, Paul Everett
Introduction: Cystic fibrosis (CF) is a genetic disease that can lead to significant morbidity. As the CF population increases and treatment regimens escalate in complexity, CF care costs are expected to rise and could put tremendous strain on health care systems. Our aim was to examine the hospitalization costs of Canadian CF patients. Methods: We performed an analysis of annual CF hospital costs for the 2014 fiscal year using a public payer perspective. Secondary objectives were to examine differences in annual hospital costs for Canadian CF patients (1) by patient characteristics, (2) between provinces, and (3) over time (from 2010 to 2014). Record-level data were obtained from the Canadian Institute for Health Information databases. CF patients were defined based on at least one hospital admission with an ICD-10 code of E84. Costs were estimated using a case-mix aggregate costing strategy. Results: In 2014, 953 of 2,702 (35%) Canadians with CF had 1,705 hospitalizations resulting in a total cost of $32.1 million.
Mean hospital cost per patient and mean cost per hospitalization were $34,982 and $19,782, respectively. There were no differences in mean cost per hospitalization by age or sex. Mean cost per hospitalization was highest among those admitted for pneumothorax ($22,685), followed by CF pulmonary exacerbation ($21,130) and distal intestinal obstruction syndrome ($18,816). The mean cost per hospitalization was highest for Alberta ($25,229) and lowest for New Brunswick ($10,734). Between 2010 and 2014, the total cost of all hospitalizations for CF patients increased by 17% ($27.4 to $32.1 million). Conclusion: Canadian CF hospitalizations are costly; these costs vary by type of admitting diagnosis and are increasing over time. These national estimates will inform health care planning as well as future cost effectiveness analyses for CF interventions.

Item Open Access Machine learning models for functional impairment risk prediction in ischemic stroke patients (2020-09-03) Alaka, Shakiru Ayomide; Sajobi, Tolulope T.; Menon, Bijoy K.; Hill, Michael D.; Williamson, Tyler S.
Background: Stroke-related functional impairment risk scores are commonly used to estimate the patient-specific risk of functional impairment in acute care settings. However, these models have been primarily developed based on regression models, which might not provide optimal predictive accuracy, especially when validated in an external cohort. Purpose: First, to evaluate the predictive accuracy of machine-learning (ML) models for predicting functional impairment risk in acute ischemic stroke patients. Second, to compare the predictive accuracy of machine-learning models and regression-based models using computer simulations. Methods: Data from the Precise and Rapid Assessment of Collaterals with Multi-phase CT Angiography (PROVE-IT) study were used. The Modified Rankin Scale (mRS) score was used to assess the 90-day functional impairment status.
Machine-learning models such as random forest (RF), classification and regression tree (CART), support vector machine (SVM), C5.0 decision tree (DT), adaptive boost machine (ABM), and least absolute shrinkage and selection operator (LASSO) logistic regression, along with logistic regression (LR), were used to predict the patient-specific risk of 90-day functional impairment. Area under the receiver operating characteristic curve (AUC), sensitivity, specificity, Matthews correlation coefficient (MCC), and Brier score were used to assess the predictive accuracy of these models via internal cross-validation and external validation in the Identifying New Approaches to Optimize Thrombus Characterization for Predicting Early Recanalization and Reperfusion with IVtPA Using Serial CT Angiography (INTERSSeCT) cohort study. Monte Carlo methods were used to develop recommendations for selecting machine-learning models under a variety of data characteristics. Results: Both logistic regression and machine-learning models had comparable predictive accuracy when validated internally (AUC range = [0.65 – 0.72]; MCC range = [0.29 – 0.42]) and externally (AUC range = [0.66 – 0.71]; MCC range = [0.34 – 0.42]). However, regression-based models had somewhat better calibration than the ML models. Our simulation study showed that ML and regression-based models are not equally robust to a variety of data analytic characteristics. LR models exhibited higher AUC in studies with a small/moderate set of predictors, while RF had about 15% higher discrimination in studies with a high-dimensional set of predictors. ML models may be less accurate for predicting outcomes in studies with a small set of predictors or when there is a large class imbalance in the data sets. Conclusions: ML and regression-based algorithms are not equally sensitive to data analytic conditions, even though our data analysis revealed no significant differences between the former and the latter.
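Two of the metrics named in this abstract, the Matthews correlation coefficient (MCC) and the Brier score, can be illustrated with a short Python sketch using their standard definitions. The helper names, the 0.5 classification threshold, and the toy outcomes and probabilities below are assumptions for illustration only and do not come from the thesis.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def mcc(y_true, y_pred):
    """Matthews correlation coefficient; returns 0.0 when undefined."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, probs)) / len(y_true)


# Hypothetical toy data (not from the study).
toy_outcomes = [1, 1, 0, 0, 1, 0]
toy_probs = [0.8, 0.6, 0.3, 0.55, 0.4, 0.1]
toy_classes = [1 if p >= 0.5 else 0 for p in toy_probs]  # assumed threshold

toy_mcc = mcc(toy_outcomes, toy_classes)
toy_brier = brier_score(toy_outcomes, toy_probs)
```

MCC summarizes discrimination of the thresholded predictions on a scale from -1 to 1, whereas the Brier score is computed on the probabilities themselves (lower is better), which is why it also reflects calibration.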
ML might offer some discriminative advantages over the latter depending on the size and type of study predictors. We recommend that the choice between these classes of models should be guided by data characteristics, study design, and purpose for which the models are being developed.

Item Open Access Optimally Linking Prehospital and Health System Data: The Association between Emergency Medical Services Offload Time, Response Time and Mortality (2020-01-14) Blanchard, Ian; Doig, Christopher James; Lang, Eddy S.; Dean, Stafford R.; Hagel, Brent Edward; Niven, Daniel J.; Williamson, Tyler S.
INTRODUCTION: Delays in offloading Emergency Medical Services (EMS) patients in the hospital may impact timely response to emergencies, but no published studies are available. Little research has been conducted on the potential for bias when EMS data are linked to health system outcome and on the optimal EMS response time for survival of critically injured or ill patients. METHODS: Three years of EMS data from a large urban system were used to create hourly estimates of median hospital time and response time, and linked to health system outcome. Multivariable modelling and descriptive statistics were used to: 1. Explore the association between paramedic hospital time and response time while controlling for the effects of system volume, time of day, and season; 2. Describe the linkage rate between the standard strategy and one designed to optimize linkage; and 3. Explore the association between response time and mortality in critically injured or ill patients who did not experience an out-of-hospital cardiac arrest while controlling for age and sex. RESULTS: Depending on the time of day, there was between a one and three minute increase in predicted median response time when the system was experiencing a median hospital time of 90 minutes, during the winter in heavy system volume, compared to a 30 minute median hospital time, during the summer in light system volume.
The optimized strategy increased the linkage rate from 88% to 97.1%, and reduced linkage failure in key clinical sub-groups. There was no significant association between response time and mortality except in one secondary analysis subgroup, which suggested longer response decreased mortality. CONCLUSIONS: There is an association between EMS hospital time and response time, but the relationship is complex and influenced by system level factors such as time of day, volume and season. An optimized strategy for linking EMS data to health system outcome improved the linkage rate and reduced the potential for bias. No consistent association between response time and mortality could be demonstrated. These analyses underscore the importance of research quality linked EMS data in the development of knowledge for EMS and paramedic practice.

Item Open Access Predicting Death by Suicide with Administrative Health Care System Data (2020-05-27) Sanderson, Michael; Patten, Scott B.; Bulloch, Andrew G. M.; Wang, Jianli L.; Williamson, Tyler S.
Quantifying suicide risk with risk scales is common in clinical practice, but the performance of risk scales has been shown to be limited. Prediction models have been developed to quantify suicide risk and have been shown to outperform risk scales, but these models have not been commonly adopted in clinical practice. The original research presented in this thesis as three manuscripts evaluates the performance of prediction models that quantify suicide risk developed with administrative health care system data. The first two manuscripts were designed to determine the most promising prediction model class and temporal data requirements. The modeling dataset contained 3548 persons that died by suicide and 35,480 persons that did not die by suicide between 2000 and 2016. 101 predictors were selected, and these were assembled for each of the 40 quarters prior to the quarter of death, resulting in 4040 predictors for each person.
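The predictor count stated above follows from laying out each of the 101 predictors once per quarter (101 × 40 = 4040). The Python sketch below shows one way such a wide layout could be enumerated; the placeholder predictor names and the `name_qN` column convention are hypothetical and are not taken from the thesis.

```python
def wide_column_names(predictors, n_quarters):
    """One column per (quarter, predictor) pair, quarter-major order.

    Quarter 1 is taken here to be the quarter closest to the index
    quarter; that ordering is an assumption of this sketch.
    """
    return [
        f"{name}_q{quarter}"
        for quarter in range(1, n_quarters + 1)
        for name in predictors
    ]


# Placeholder predictor names x1 ... x101 (hypothetical).
base_predictors = [f"x{i}" for i in range(1, 102)]
columns_40_quarters = wide_column_names(base_predictors, 40)
n_columns = len(columns_40_quarters)
```

A flat layout like this suits models that take a fixed-length feature vector, while sequence models would typically keep the quarter dimension separate.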
Logistic regression, feedforward neural network, recurrent neural network, one-dimensional convolutional neural network, and gradient boosted trees model classes were compared. The gradient boosted trees model class achieved the best performance and 8 quarters of data at most were required for optimal performance. The third manuscript applied the findings from the first two manuscripts to evaluate the performance of prediction models in a clinical setting. The prediction models quantified the risk of death by suicide within 90 days following an Emergency Department visit for parasuicide. The modeling dataset contained 268 persons that died by suicide and 33,426 persons that did not die by suicide between 2000 and 2017. The predictors were assembled for each of the 8 quarters prior to the quarter of death, resulting in 808 predictors for each person. Logistic regression and gradient boosted trees model classes were compared. The optimal gradient boosted trees model achieved promising discrimination and calibration. Following the manuscripts, this thesis discusses further research. At present, there is no clinical consensus on the preferred performance characteristics for quantifying suicide risk. The critical next step for further research is to discover the preferred performance characteristics for quantifying suicide risk and to discover whether the preferred performance characteristics can be achieved.

Item Open Access Social Support in a Pregnant and Postnatal Population (2019-04-24) Hetherington, Erin Louise; Tough, Suzanne C.; McDonald, Sheila W.; Patten, Scott B.; Williamson, Tyler S.
Background: Social support, in the form of emotional, informational and tangible resources provided by friends and family is beneficial for health. Social support in pregnancy and the postpartum period is also thought to improve birth outcomes and maternal mental health. However, questions remain as to what type of support is important and for which outcomes.
In addition, little is known about patterns of support over time. Methods: A systematic review was conducted to determine the association between low social support and preterm birth. Data from the All Our Families cohort (n=3200) were used for the remaining two projects. This cohort recruited women in pregnancy and followed them to 1 year postpartum, measuring demographic, psychosocial and birth outcome information. Multivariable binomial regression was used to estimate the impact of social support during pregnancy and in the early postpartum period on subsequent mental health symptoms. Group based trajectory modeling was used to determine patterns of support from pregnancy to four months postpartum, followed by multinomial regression to determine characteristics associated with different patterns of support. Results: The systematic review found no direct association between social support and preterm birth; however, low social support was associated with preterm birth among women experiencing high stress. The second analysis revealed elevated risk of subsequent depression and anxiety symptoms among women with low support, across various levels of previous mental health risk. Finally, the trajectories analysis showed stable support among 98% of women. Stable high support (60% of women) was associated with higher income. Conclusion: Social support can impact both birth outcomes and maternal mental health, and is relatively stable for most women during pregnancy and postpartum. Interventions to improve support will have a larger absolute benefit for women who may be vulnerable due to previous mental health challenges.
More research is needed to understand how to influence conditions that will allow women to develop and maintain strong support networks.

Item Open Access The Effectiveness and Acceptability of Six-Month Isoniazid Preventive Therapy amongst People Living with HIV in KwaZulu-Natal, South Africa (2018-06-20) Boffa, Jody; Williamson, Tyler S.; Mayan, Maria J.; Fisher, Dina A.; Sauvé, Reg S.
Tuberculosis (TB) preventive therapy is an integral part of global strategies to end TB. Isoniazid preventive therapy (IPT) is currently the only regimen recommended globally for low-resource settings with high burdens of TB and TB-HIV. In South Africa, where the incidence of TB and TB-HIV are among the highest in the world, health districts were quick to facilitate access to six-month IPT in the absence of active TB symptoms for all people living with HIV, amid numerous unknowns. My doctoral thesis responds to some of these unknowns; specifically, the effectiveness of IPT to reduce TB incidence and its acceptability in communities where latent TB infection was previously unfamiliar. The research occurred within a community-based participatory research framework including regular meetings with grassroots community advisory teams in three communities of uMgungundlovu District, KwaZulu-Natal. IPT effectiveness was evaluated utilising a retrospective cohort design, comparing TB incidence across two years among people receiving IPT alone, antiretroviral therapy (ART) alone, or IPT+ART to those without intervention. Acceptability was evaluated utilising the ethnographic method, including extensive field work, eight group interviews to learn about perspectives of TB infection, disease and IPT, and nine individual interviews with people accepting, discontinuing or declining IPT to learn about IPT experiences and decision making.
Among those who completed the regimen, IPT significantly reduced the two-year TB incidence by 100% among women (97.5%CI=78-100%), with a less certain effect among men: IR=0.46, 95%CI=0-85%. IPT also appeared to provide additional prevention for people on ART. Nevertheless, IPT was interpreted by some as dangerous when the costs related to pill collection or consumption exacerbated poverty, the stigma associated with HIV and ART was conflated with its use, or it was seen as toxic. Clinical expectations of IPT initiation and adherence may also conflict with expectations of women in Zulu culture. Some women may initiate IPT to please the healthcare provider, rather than from a belief in preventive benefits. Taken together, findings suggest that IPT can reduce the risk of TB among people living with HIV, but may not be a high priority when economic and social needs compete.

Item Open Access Using machine learning methods to improve chronic disease case definitions in primary care electronic medical records (2018-04-23) Lethebe, Brendan Cord; Williamson, Tyler S.; Sajobi, Tolulope T.; Quan, Hude; Ronksley, Paul Everett
Background: Chronic disease surveillance at the primary care level is becoming more feasible with the increased use of electronic medical records (EMRs). However, the quality of surveillance information is directly dependent on the quality of the case definitions that identify the conditions of interest. Purpose: To determine whether machine learning algorithms can produce chronic disease case definitions comparable to committee-created case definitions in a primary care EMR setting. Methods: A chart review was conducted for the presence of hypertension, diabetes, osteoarthritis, and depression in a cohort of 1920 patients from the Canadian Primary Care Sentinel Surveillance Network database. The results of this chart review were used as training data.
The C5.0, Classification and Regression Tree, and Chi-Squared Automated Interaction Detection decision trees, Forward Stepwise logistic regression, and Least Absolute Shrinkage and Selection Operator penalized logistic regression were compared using 10-fold cross validation. Sensitivity, specificity, positive predictive value and negative predictive value were estimated and compared for the four chronic conditions of interest. Results: Validity measures were similar across algorithms. For hypertension, sensitivity ranged from 93.1% to 96.7%, while specificity ranged from 88.8% to 93.2%. For diabetes, sensitivities ranged from 93.5% to 96.3% with specificities between 97.1% and 99.0%. For osteoarthritis, sensitivities ranged from 82.0% to 84.4% with specificities between 92.7% and 94.0%. For depression, sensitivities ranged from 81.4% to 88.3%, and specificities ranged from 93.4% to 94.9%. Compared with the committee-created case definitions, these metrics were equivalent or better using the machine learning method. Conclusions: Machine learning algorithms produced accurate case definitions comparable to committee-created case definitions. It is possible to use machine learning techniques to develop high quality case definitions from EMR data.

Item Open Access Zoster prophylaxis after allogeneic hematopoietic cell transplantation using acyclovir/valacyclovir followed by vaccination (American Society of Hematology, 2016) Jamani, Kareem; MacDonald, Judy; Lavoie, Martin; Williamson, Tyler S.; Brown, Christopher B.; Chaudhry, Ahsan; Jimenez-Zepeda, Victor H.; Duggan, Peter; Tay, Jason; Stewart, Douglas; Daly, Andrew; Storek, Jan
Varicella zoster virus (VZV) disease (usually cutaneous zoster) occurs frequently after hematopoietic cell transplantation (HCT), and postherpetic neuralgia (PHN) results in poor quality of life. The optimal prophylaxis of VZV disease/PHN has not been established.
At our center, before 2008, VZV prophylaxis consisted of ∼1 year of post-HCT acyclovir/valacyclovir (“old strategy”), whereas post-2008 prophylaxis consisted of 2 years of acyclovir/valacyclovir followed by immunization using varicella vaccine (“new strategy”). We performed a retrospective study comparing the cumulative incidence of VZV disease and PHN among patients who completed the old strategy (n = 153) vs the new strategy (n = 125). Patients who completed the old strategy had a significantly higher cumulative incidence of VZV disease (33% vs 17% at 5 years, P ≤ .01) and PHN (8% vs 0% at 5 years, P = .02). In conclusion, VZV prophylaxis with 2 years of acyclovir/valacyclovir followed by vaccination appears to result in a low incidence of VZV disease and may eliminate PHN.