Browsing by Author "Deardon, Robert"
Now showing 1 - 5 of 5
Item Open Access
Bayesian Variable Selection Model with Semicontinuous Response (2022-01-14)
Babatunde, Samuel; Chekouo, Thierry; Sajobi, Tolulope; Zhang, Qingrun; Deardon, Robert; Bezdek, Karoly
We propose a novel Bayesian variable selection approach that identifies a set of features associated with a semicontinuous response. We use a two-part model in which one part is a logit model that estimates the probability of a zero response and the other is a log-normal model for responses greater than zero (positive values). A Stochastic Search Variable Selection (SSVS) procedure is used to randomly sample the indicator variables for variable selection, which in turn searches the space of feature subsets and identifies the most promising features in the model. For the logistic model, a data augmentation approach is used to sample from the posterior density. We impose a spike-and-slab prior for the regression effects where the unselected covariates take on a prior mass at zero while the selected covariates follow a normal distribution (including the intercept and clinical covariates). Since the joint posterior density has no closed form, we employ Markov chain Monte Carlo (MCMC) techniques to sample from the posterior distribution. Simulation studies are used to assess the performance of the proposed method. We computed the average area under the receiver operating characteristic curve (AUC) to assess variable selection and compared it with competing methods. We also assessed the convergence of our MCMC algorithm by computing the potential scale reduction factor and correlations between the marginal posterior probabilities. We finally apply our method to the coronary artery disease (CAD) data where the aim is to select important genes associated with the CAD index.
This data consists of clinical covariates and gene expressions.

Item Open Access
Causal Inference With Non-probability Sample and Misclassified Covariate (2022-09)
Sevinc, Emir; Shen, Hua; Lu, Xuewen; Deardon, Robert; Shen, Hua; Badescu, Alexandru
Causal inference refers to the analysis of data that is explicitly framed around a question of causality. The problems motivating many, if not most, studies in the social and biological sciences tend to be causative and not associative. A well-defined and systematically representative sample tends to be the basis of such studies. However, sometimes a sample may result from a non-probability process. This often poses a unique challenge in estimating the probability of an individual being in the sample, and in generalizing the causal conclusions drawn from the non-probability sample to the target population. Additionally, due to issues such as the difficulty of precise measurement and human error, certain variables may be classified incorrectly. In this thesis, we address both challenges by implementing causal inferential methods in a case where we have a main non-probability sample with the response available, and a probability sample with auxiliary information only. We deal with the presence of an incorrectly classified confounder in the non-probability sample only, or in both samples. We examine the consequences of naively ignoring misclassification, and develop a latent-variable based method via an Expectation-Maximization algorithm to correct for the misclassified confounder. We incorporate this method with a double-robust mean estimator requiring only the correct specification of either the regression model or the non-probability sample selection model to estimate the average treatment effect.
We demonstrate the effectiveness of our methodology via simulation studies, and implement it on smoking data from the Centers for Disease Control and Prevention (CDC).

Item Open Access
Competing Risk Analysis with Misclassified Covariates (2020-09-25)
Li, Ruoyu; Shen, Hua; Lu, Xuewen; Deardon, Robert
Misclassification in categorical variables and missing data can often occur concurrently in medical research. Though there has been extensive research on either topic, relatively little work is available to address both issues simultaneously, especially in survival analysis. In this thesis, we first propose a method for competing risk analysis involving a latent categorical covariate where validation data is absent and the latent variable of interest is only measured subject to misclassification via a set of surrogate variables. We then extend it to a more general setting where the latent covariate is not measured by the same number of surrogate variables for all subjects. For example, the decision to measure a subject with an additional surrogate variable depends on the available faulty measurements of the latent variable by preceding surrogate variables, resulting in a sequential missing pattern among the surrogates. In both cases we apply a direct approach to the analysis of competing risks, focusing on the cumulative incidence functions of the event of interest and its competing events, and adopt flexible parametric forms for the baseline cumulative incidence functions. We develop likelihood-based methods based on expectation-maximization algorithms and jointly model the competing risks, surrogate variables and latent covariate of interest. The procedures simultaneously allow estimation of the covariate effects on the event of interest, parameters in the baseline cumulative incidence functions, regression coefficients in the misclassification model and association between the latent covariate and other completely and precisely observed covariates.
We evaluate the empirical performance of the proposed methods in simulation studies. We conclude that they outperform the naive and ad hoc approaches in both cases and are relatively robust to sample size, misclassification rate and missing proportion of the surrogate variables. Finally, we apply the proposed method to the stimulating study on breast cancer. Discussion and future work are outlined at the end.

Item Open Access
Investigating the Risk of Primary Invasive Cancer Among Individuals With a History of Bacterial Sexually Transmitted Infections: A Population-Based Study in Alberta, Canada (2022-11-22)
Qureshi, Hina M; Fidler-Benaoudia, Miranda M; Bobawsky, Kirsten M; Deardon, Robert; Kassam, Aliya
Background: The number of new cancer cases is rising in Canada. Incidence rates of chlamydia, gonorrhea, and syphilis have also increased over the last two decades. However, few studies have explored the relationship between these bacterial sexually transmitted infections (STIs) and cancer risk. Purpose: To investigate the risk of primary invasive cancer among individuals with a prior reported diagnosis of chlamydia, gonorrhea, and/or syphilis in Alberta, Canada, from 2000 to 2019. Methods: This population-based, data-linkage, retrospective cohort study explored seven exposure categories based on prior records of three bacterial STIs. We investigated outcomes for (a) cancers overall, including all cancer sites, and (b) thirteen individual cancer regions. A cohort comprising 175,024 Albertan residents with a reported diagnosis of chlamydia, gonorrhea, and/or syphilis from January 1, 2000, to December 31, 2019, was identified and followed until the development of invasive cancer, death, or the study end date, whichever happened first.
The cancer incidence rate in the STI cohort was compared with the cancer incidence rate in the general population of Alberta using standardized incidence ratios (SIR) and absolute excess risk (AER) with 95% confidence intervals (CI). Results: We identified 1,593 subsequent first primary invasive cancers and in-situ bladder carcinomas during the 1,385,580 person-years of observation (median follow-up=7.3 years; IQR=3.4-11.9). We did not find increased or decreased risk for cancers overall associated with any of the seven exposure categories. However, an increase in risk was identified for cancers of female genital organs (SIR=1.4, 95%CI=1.2,1.6; AER=0.6, 95%CI=0.3,0.9) and male genital organs (SIR=1.5, 95%CI=1.2,1.8; AER=1.1, 95%CI=0.5,1.8) among those exposed to chlamydia only; cancers of digestive organs in case of exposure to gonorrhea only (both sexes combined: SIR=1.4, 95%CI=1.02,2.0; AER=1.6, 95%CI=0.0,3.2); cancers of respiratory and intrathoracic organs (both sexes combined: SIR=1.7, 95%CI=1.1,2.5; AER=3.8, 95%CI=0.2,7.3), as well as cancers of hematopoietic and reticuloendothelial system (both sexes combined: SIR=1.6, 95%CI=1.02,2.3; AER=3.3, 95%CI=-0.3,6.8) in case of syphilis only exposure; and cancers of female genital organs among those exposed to chlamydia and gonorrhea (SIR=1.9, 95%CI=1.2,3.0; AER=1.2, 95%CI=0.1,2.2). Conclusion: Overall findings of this study suggest a possible role of common bacterial STIs in carcinogenesis.

Item Open Access
Rapid Large-Scale Inference of Genome-Wide Mutational Heterogeniety (2016)
Mathankeri, Aaron; de Koning, A.P. Jason; Chan, Jennifer; Deardon, Robert
Tumours arise by mutation and natural selection among cellular lineages. Understanding and modelling mutation is thus a central aspect of cancer research. Genes that confer a selective advantage to their cell-line when mutated are known as drivers and are usually identified by statistical enrichment of mutations.
Current approaches to detect drivers make several simplifying assumptions, sacrificing biological realism for computational speed when modelling mutation. The main novel technical contribution of this thesis is the presentation of a principled mathematical framework for mutational analysis in genomic data that we term "Mut-HMM". Calculations required for large-scale inference were parallelized to take advantage of many-core CPU clusters. Based on this work, I present a new software package that can be orders of magnitude faster than previous state-of-the-art methods for analysis of genome-wide mutation patterns. I then present an exploratory analysis of chromosome 22 germline mutation data, showing that the results highlight the need for more complex and sophisticated mutation models in cancer and human genomics.