A data mining framework for efficient discovery of classification rules

Advisor: Barker, Kenneth E.
Author: Gopalan, Janaki
Date accessioned: 2005-08-16
Date available: 2005-08-16
Date issued: 2004
Citation: Gopalan, J. (2004). A data mining framework for efficient discovery of classification rules (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. doi:10.11575/PRISM/10785
Identifier: 0612976483
URI: http://hdl.handle.net/1880/41548
Description: Bibliography: p. 103-110
Extent: xiv, 110 leaves : ill. ; 30 cm.
Language: eng
Type: master thesis
Call number: AC1 .T484 2004 G67

Abstract:
Associative classification is an important research topic in data mining (DM). The thesis proposes a framework to derive accurate and interesting classification rules using the association rule mining (ARM) technique. To address the rule discovery task effectively, the framework identifies two fundamental problems, one in the pre-processing component and one in the post-processing component of the DM process.

In the pre-processing component, the choice of the training set is identified as an important factor in deriving good classification rules. The thesis proposes a novel technique that uses a genetic algorithm (GA) to find an appropriate split of a dataset into training and test sets. Using the resulting training set as the input to the ARM technique generates high-accuracy classification rules.

It is also identified that an algorithm (or heuristic) is required to find the best set of interesting and accurate rules among those discovered. In the post-processing component, the thesis proposes a pruning strategy that uses a GA to find the accurate and interesting rules.

Rights: University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.