Developing Novel Supervised Learning Model Evaluation Metrics Based on Case Difficulty

Lee, Joon
Kwon, Hyunjin
2024-01-17
2024-01-17
2024-01-05
Kwon, H. (2024). Developing novel supervised learning model evaluation metrics based on case difficulty (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
https://hdl.handle.net/1880/117961
10.11575/PRISM/42805

Performance evaluation is an essential step in the development of machine learning models. The performance of a model informs the direction of its development and provides diverse knowledge to researchers. The most common ways to assess a model’s performance are based on counting the numbers of correct and incorrect predictions the model makes. However, this conventional approach to evaluation is limited in that it does not consider the differences in prediction difficulty between individual cases. Although metrics for quantifying the prediction difficulty of individual cases exist, their usefulness is hindered by the fact that they cannot be applied universally across all types of data; that is, each metric requires that specific data conditions be met for its use, which can be a significant obstacle when dealing with real-world data characterized by diversity and complexity. Therefore, this thesis proposes new metrics for calculating case difficulty that perform well across diverse datasets. These new case difficulty metrics were developed using neural networks based on varying definitions of prediction difficulty. In addition, the metrics were validated using various datasets and compared with existing metrics from the literature. New performance evaluation metrics incorporating case difficulty to reward correct predictions of high-difficulty cases and penalize incorrect predictions of low-difficulty cases were also developed. A comparison of these case difficulty-based performance metrics with conventional performance metrics revealed that the novel evaluation metrics could provide a more detailed explanation and deeper understanding of model performance.
We anticipate that researchers will be able to calculate case difficulty in diverse datasets under various data conditions with our proposed metrics and use these values to enhance their studies. Moreover, our newly developed evaluation metrics considering case difficulty could serve as an additional source of insight for the evaluation of classification model performance.

en
University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
Machine Learning
Case Difficulty
Binary and Multiclass Classification
Artificial Intelligence
Computer Science
master thesis