A data mining framework for efficient discovery of classification rules
Abstract
Associative classification is an important research topic in data mining (DM). This thesis proposes a framework for deriving accurate and interesting classification rules using association rule mining (ARM). To address the rule discovery task effectively, the framework identifies two fundamental problems, one in the pre-processing component and one in the post-processing component of the DM process. In the pre-processing component, the choice of the training set is identified as an important factor in deriving good classification rules. The thesis proposes a novel technique that uses a genetic algorithm (GA) to find an appropriate split of a dataset into training and test sets. Using the resulting training set as input to the ARM technique generates high-accuracy classification rules. An algorithm (or heuristic) is also required to select the best set of interesting and accurate rules from those discovered. In the post-processing component, the thesis therefore proposes a GA-based pruning strategy to find the accurate and interesting rules.
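The idea of searching for a training/test split with a GA can be illustrated with a minimal sketch. The abstract does not specify the chromosome encoding, the fitness function, or the genetic operators, so everything below is an assumption made for illustration: a split is encoded as a bit mask (1 = training record), fitness is the test accuracy of simple one-attribute rules standing in for ARM-derived rules, and the names `fitness` and `evolve_split` are hypothetical.

```python
import random

def fitness(mask, records):
    """Test-set accuracy of rules learned from the training part of the split.

    Assumption: each record is a (value, label) pair, and a trivial
    "value -> most frequent label" rule learner stands in for ARM.
    """
    train = [r for r, m in zip(records, mask) if m]
    test = [r for r, m in zip(records, mask) if not m]
    if not train or not test:
        return 0.0  # a split with an empty side is useless
    counts = {}
    for value, label in train:
        by_label = counts.setdefault(value, {})
        by_label[label] = by_label.get(label, 0) + 1
    rules = {v: max(c, key=c.get) for v, c in counts.items()}
    hits = sum(1 for value, label in test if rules.get(value) == label)
    return hits / len(test)

def evolve_split(records, pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """Evolve a bit mask over the records; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n = len(records)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: fitness(m, records), reverse=True)
        parents = ranked[: pop_size // 2]   # truncation selection
        children = [ranked[0]]              # elitism: carry the best split over
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation moves a record between training and test sets.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit for bit in child]
            children.append(child)
        population = children
    best = max(population, key=lambda m: fitness(m, records))
    return best, fitness(best, records)

# Toy dataset (made up for illustration): one attribute value and a class label.
records = [("a", "yes"), ("a", "yes"), ("b", "no"), ("b", "no"),
           ("a", "yes"), ("b", "no"), ("c", "yes"), ("c", "no")]
mask, score = evolve_split(records)
print(mask, score)
```

This fitness can favour very small test sets, so a practical version would probably constrain the split ratio; whether and how the thesis does so is not stated in the abstract.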