Applications of Text Mining Techniques on Automated Software System Verification

Date
2019-09
Abstract

Software Verification and Validation (V&V) is an essential task in software engineering. As software systems become larger and more complex, manual V&V becomes increasingly time-consuming and expensive. Text mining techniques enable the automatic characterization of a software system by extracting its behavioural features. Although text mining techniques have been leveraged for automation in many domains of software verification, there remain domains in which their application has not yet been investigated. This thesis consists of two separate manuscripts and has a twofold goal. In the first manuscript, I investigate two new and complex text mining heuristics, namely GlobalBugLocator and GlobalDoc2Vec, in automated fault localization, which is a well-known field of research in automated software verification. In the second manuscript, I investigate the application of a well-known text mining technique to automated Spatial Data Integration (SDI), which is a new and unexplored field in automated software verification. The results of the first manuscript show that GlobalBugLocator outperforms BugLocator (a state-of-the-art technique) in 64% of the cases in terms of MRR (Mean Reciprocal Rank) and in 54% of the cases in terms of MAP (Mean Average Precision), with an average improvement rate of 14%. In addition, when applied to 51 software projects, GlobalBugLocator improved the mean performance of BugLocator by an average of 6.6% in MRR and 4.8% in MAP. This improvement is significant compared to the improvement rates of other state-of-the-art methods. However, a complex word embedding solution (GlobalDoc2Vec) is not always effective, and in some cases the added complexity decreases performance relative to the simpler methods.
In the second study, which was conducted in collaboration with industry, the results indicate strong performance of the studied text mining technique in SDI automation, with precision and recall values above 95% in four real-world experiments.
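
For readers unfamiliar with the two ranking metrics reported above, the following is a minimal sketch of how MRR and MAP are commonly computed for a fault localization ranking. It is not taken from the thesis: the function names, the file names, and the choice to divide average precision by the total number of relevant files are illustrative assumptions.

```python
from typing import List, Set, Tuple

# Each query is one bug report: (ranked list of candidate files, set of truly faulty files).
Query = Tuple[List[str], Set[str]]


def reciprocal_rank(ranked: List[str], relevant: Set[str]) -> float:
    """Return 1 / position of the first relevant file, or 0.0 if none is ranked."""
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / position
    return 0.0


def average_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average the precision values at each position where a relevant file appears.

    Assumption: the sum is divided by the total number of relevant files.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for position, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(relevant)


def mean_reciprocal_rank(queries: List[Query]) -> float:
    """Mean of the reciprocal ranks over all bug reports."""
    return sum(reciprocal_rank(r, rel) for r, rel in queries) / len(queries)


def mean_average_precision(queries: List[Query]) -> float:
    """Mean of the average precision values over all bug reports."""
    return sum(average_precision(r, rel) for r, rel in queries) / len(queries)


# Hypothetical example: two bug reports with made-up file names.
queries: List[Query] = [
    (["a.java", "b.java", "c.java"], {"b.java"}),
    (["x.java", "y.java", "z.java"], {"x.java", "z.java"}),
]
mrr = mean_reciprocal_rank(queries)
map_score = mean_average_precision(queries)
```

In this hypothetical example the first faulty file appears at rank 2 for the first report and rank 1 for the second, so MRR is (1/2 + 1) / 2 = 0.75. Higher values of either metric mean that faulty files are ranked closer to the top of the list.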

Keywords
Software System Verification, Text Mining, Automated Geospatial Data Verification, Automated Fault Localization
Citation
Miryeganeh, S. N. (2019). Applications of Text Mining Techniques on Automated Software System Verification (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.