Walker, Robert James
Kianifar, Mohammad Reza
2024-01-24
2024-01-24
2024-01-23
Kianifar, M. R. (2024). Graph generalization for software engineering (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
https://hdl.handle.net/1880/118063
10.11575/PRISM/42907
Graph generalization is a powerful concept with a wide range of potential applications, not only within software engineering but also across various domains. While established algorithms exist for generalizing simple graphs, such as trees, the development of practical methods for applying generalization techniques to more complex graphs remains a critical challenge. In this thesis, we introduce a novel formal model and algorithm, referred to as GGA (Graph Generalization Algorithm), dedicated to generalizing labelled directed graphs. We evaluate GGA on its information preservation relative to its input graphs and on its scalability in execution, across three applications that each use a different kind of graph: (1) abstract syntax trees (ASTs); (2) class graphs; and (3) call graphs. In the first case, GGA is compared against ASGard and Diff-Sitter, two existing approaches for tree-based generalization; in the latter two cases, GGA is compared against Diff and CodeMetrics. Our findings reveal GGA's superiority over the alternatives. In the AST application, GGA outperforms ASGard by an average of 5-18% on metrics related to information preservation. GGA's results in the AST differencing context also matched Diff-Sitter's in 100% of cases, by applying symmetrical filtering to skip strictness configuration. In the context of class graphs, GGA achieves a precision@5 of 77.1%, while in the case of call graphs, it achieves a precision@5 of 60%. We also performed extensive performance testing for the first two applications; the results show that GGA's execution time scales linearly with the product of vertex count and edge count.
Our research not only introduces a novel algorithm for graph generalization but also demonstrates its ability to preserve information in diverse applications while performing efficiently. These results signify the potential of GGA to advance the field of graph generalization and its practical applicability across various domains, specifically in software engineering.
en
University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
Graph Generalization
Software Engineering
Education--Sciences
Graph Generalization for Software Engineering
master thesis