QDA Classification for Two-Component Mixture with Data of Rare and Weak Signal

dc.contributor.advisor: Wu, Jingjing
dc.contributor.advisor: Shen, Hua
dc.contributor.author: Chen, Hanning
dc.contributor.committeemember: de Leon, Alexander R.
dc.contributor.committeemember: Liu, Shawn X.
dc.date: 2020-02
dc.date.accessioned: 2020-01-14T20:59:39Z
dc.date.available: 2020-01-14T20:59:39Z
dc.date.issued: 2019-12-20
dc.description.abstract: This thesis deals with the two-class classification problem for data with rare and weak signals, under the modern setup of p >> n (large p, small n). Considering a two-component mixture of Gaussian features with different random mean vectors of rare and weak signals but a common covariance matrix (homoscedastic Gaussian), Fan et al. (2013) discussed the optimality of linear discriminant analysis (LDA) and proposed an efficient variable selection and classification procedure. This thesis extends their work by assuming that the two components have different random covariance matrices (heteroscedastic Gaussian) of rare and weak signals. As a starting point, and for simplicity, we assume that the two population mean vectors are the same, in order to assess the pure effect of the different covariance matrices. We propose using quadratic discriminant analysis (QDA) to classify data with rare and weak signals. On the theoretical side, we first derive the detection boundary of QDA at the population level, which separates the region of successful classification from the region of unsuccessful classification in the ideal case where the covariance matrix is known. When the covariance matrix is unknown, we obtain a subregion in which successful classification is impossible for any classifier; this subregion is therefore also part of the unsuccessful classification region of QDA. For data with rare signals, variable selection generally improves the performance of statistical procedures. On the implementation side, we therefore propose a variable selection procedure for QDA based on Higher Criticism Thresholding (HCT), which was shown to be efficient for LDA in Fan et al. (2013). Finally, we conduct extensive simulation studies to demonstrate and explore the successful and unsuccessful classification regions of QDA and to examine the effectiveness of the proposed HCT procedure. [en_US]
dc.identifier.citation: Chen, H. (2019). QDA Classification for Two-Component Mixture with Data of Rare and Weak Signal (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. [en_US]
dc.identifier.doi: http://dx.doi.org/10.11575/PRISM/37452
dc.identifier.uri: http://hdl.handle.net/1880/111495
dc.language.iso: eng [en_US]
dc.publisher.faculty: Science [en_US]
dc.publisher.institution: University of Calgary [en]
dc.rights: University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission. [en_US]
dc.subject: high dimensional data [en_US]
dc.subject: quadratic discriminant analysis (QDA) [en_US]
dc.subject: higher criticism [en_US]
dc.subject: classification [en_US]
dc.subject.classification: Education--Sciences [en_US]
dc.title: QDA Classification for Two-Component Mixture with Data of Rare and Weak Signal [en_US]
dc.type: master thesis [en_US]
thesis.degree.discipline: Mathematics & Statistics [en_US]
thesis.degree.grantor: University of Calgary [en_US]
thesis.degree.name: Master of Science (MSc) [en_US]
ucalgary.item.requestcopy: true [en_US]
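
As a rough illustration of the setting described in the abstract (two Gaussian classes with equal means and different, known covariance matrices), the population-level QDA rule can be sketched as below. This is only a minimal sketch under simplifying assumptions that are not taken from the thesis: both classes have zero mean, the covariance matrices are diagonal, the class priors are equal, and the function names `qda_score` and `qda_classify` are hypothetical.

```python
import math


def qda_score(x, var0, var1):
    """Log-likelihood ratio log f1(x) - log f0(x) for two zero-mean Gaussians.

    Assumption (illustrative only): both covariance matrices are diagonal,
    given as lists of per-feature variances var0 (class 0) and var1 (class 1).
    """
    score = 0.0
    for xj, v0, v1 in zip(x, var0, var1):
        # Per-feature contribution: quadratic term plus log-determinant term.
        score += 0.5 * (xj * xj * (1.0 / v0 - 1.0 / v1) + math.log(v0 / v1))
    return score


def qda_classify(x, var0, var1):
    """Assign class 1 when the score is positive, else class 0 (equal priors)."""
    return 1 if qda_score(x, var0, var1) > 0 else 0


# Rare signal: only the first of five features has a different variance.
var0 = [1.0, 1.0, 1.0, 1.0, 1.0]
var1 = [4.0, 1.0, 1.0, 1.0, 1.0]

# A large value in the signal coordinate favours the high-variance class.
print(qda_classify([3.0, 0.1, -0.2, 0.0, 0.3], var0, var1))
# A small value in the signal coordinate favours the low-variance class.
print(qda_classify([0.2, 0.1, -0.2, 0.0, 0.3], var0, var1))
```

Features whose variances are identical in both classes contribute exactly zero to the score, which is one way to see why variable selection matters when the signals are rare.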
Files
Original bundle
Name: ucalgary_2019_chen_hanning.pdf
Size: 473.03 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 2.62 KB
Format: Item-specific license agreed upon to submission