On Some New Variable Selection Methods for Multivariate Survival Data

Date
2023-08
Abstract
This dissertation proposes variable selection methods for reducing dimensionality in complex lifetime data for survival analysis. With the advent of big data, survival analyses often involve a large number of covariates, which makes it necessary to identify the relevant ones. High-dimensional data, especially with increasing sample size, pose challenges for variable selection. The dissertation focuses on simultaneous estimation and variable selection methods under various censored data types and survival models, examining their theoretical properties and their performance in finite samples. The analysis of complex lifetime data faces challenges from several sources, including various types of censoring, diverse models, and multiple outcomes. Traditional survival analysis deals primarily with univariate survival data, focusing on a single event of interest. However, real-world applications frequently involve multiple event types with distinct underlying causes and risk factors. This research investigates three types of multiple-event data: competing risks, semi-competing risks, and multivariate failure time data. For competing risks data, Chapter 2 considers interval-censored models. A penalized variable selection method is proposed that uses the LASSO, adaptive LASSO, and broken adaptive ridge (BAR) regression. Simulation studies show that the proposed method effectively selects the important variables. It is also successfully applied to a real-life HIV study dataset. For semi-competing risks data, Chapter 3 explores an illness-death model with shared frailty. Parametric and semiparametric models are employed to examine the effects of covariates and to conduct variable selection. The proposed method performs well in simulation studies and in an analysis of colon cancer data. For multivariate failure time data, Chapter 4 introduces the sparse group broken adaptive ridge (SGBAR) penalty.
This penalty facilitates variable selection at both the individual and group levels and is applied to interval-censored data. Extensive simulation studies confirm that the method performs well, and it is further validated using real-life data from the Aerobics Center Longitudinal Study (ACLS). In summary, this dissertation proposes new variable selection methods for complex lifetime data. It addresses challenges associated with competing risks, semi-competing risks, and multivariate failure time data. The proposed methods are supported by theoretical analysis, simulation studies, and real-life applications.
Keywords
Penalized Variable Selection, Competing Risks Data, Semi-competing Risks Data, Multivariate Failure Time Data, Lifetime Data Analysis
Citation
Mahmoudi, F. (2023). On some new variable selection methods for multivariate survival data (Doctoral thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.