Automated Software Testing of Deep Neural Network Programs

dc.contributor.advisor: Hemmati, Hadi
dc.contributor.author: Vahdat Pour, Maryam
dc.contributor.committeemember: Behjat, Laleh
dc.contributor.committeemember: Far, Behrouz Homayoun
dc.date: 2020-09
dc.date.accessioned: 2020-09-28T14:17:30Z
dc.date.available: 2020-09-28T14:17:30Z
dc.date.issued: 2020-09-23
dc.description.abstract: Machine Learning (ML) models play an essential role in various applications. In particular, in recent years, deep neural networks (DNNs) have been leveraged in a wide range of application domains. Given such growing adoption, faults in DNN models can raise concerns about their trustworthiness and may cause substantial losses. Therefore, detecting erroneous behaviours in any machine learning system, especially DNNs, is critical. Software testing is a widely used mechanism for detecting faults. However, since the exact expected output of most DNN models is not known for a given input, traditional software testing techniques cannot be directly applied. In the last few years, several papers have proposed testing techniques and adequacy criteria for testing DNNs. This thesis studies three types of DNN testing techniques, using text and image input data. In the first technique, I use Multi Implementation Testing (MIT) to generate a test oracle for finding faulty DNN models. In the second experiment, I compare the best adequacy metric from the coverage-based criteria (Surprise Adequacy) with the best example of the mutation-based criteria (DeepMutation) in terms of their effectiveness in detecting adversarial examples. Finally, in the last experiment, I apply three different test generation techniques (including a novel one) to DNN models and compare their performance when the generated test data are used to re-train the models. The results of the first experiment indicate that using MIT as a test oracle can successfully detect faulty programs. In the second study, the results indicate that although the mutation-based metric can perform better in some experiments, it is sensitive to its parameters and requires hyper-parameter tuning. Finally, the last experiment shows a 17% improvement in F1-score when the approach proposed in this thesis is used, compared to the original models from the literature.
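For readers unfamiliar with the MIT oracle mentioned in the abstract, the following is a minimal sketch of the idea, assuming a majority-vote rule over independently built implementations of the same model; the function name, vote rule, and return values are illustrative assumptions, not the thesis's exact procedure.

    # Minimal sketch of a Multi Implementation Testing (MIT) oracle:
    # run several independent implementations of the same DNN on one
    # input and treat the majority prediction as a pseudo-oracle.
    # Illustrative only; not the thesis's exact procedure.
    from collections import Counter
    from typing import Callable, List

    def mit_oracle(implementations: List[Callable], test_input):
        """Return the majority prediction, the agreement ratio, and the
        indices of implementations that disagree with the majority
        (candidates for faulty DNN code)."""
        predictions = [impl(test_input) for impl in implementations]
        majority_label, votes = Counter(predictions).most_common(1)[0]
        suspects = [i for i, p in enumerate(predictions) if p != majority_label]
        return majority_label, votes / len(predictions), suspects

    # Toy usage: three "implementations" classifying the same input.
    impls = [lambda x: x > 0, lambda x: x > 0, lambda x: x >= 0]
    label, agreement, suspects = mit_oracle(impls, 0)
    print(label, agreement, suspects)  # False 0.666... [2] -> impl 2 is suspect

Disagreement with the majority does not prove a fault, but it flags which implementation to inspect, which is the core of using multiple implementations as a substitute for a missing exact oracle.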
dc.identifier.citation: Vahdat Pour, M. (2020). Automated Software Testing of Deep Neural Network Programs (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
dc.identifier.doi: http://dx.doi.org/10.11575/PRISM/38260
dc.identifier.uri: http://hdl.handle.net/1880/112601
dc.language.iso: eng
dc.publisher.faculty: Schulich School of Engineering
dc.publisher.institution: University of Calgary
dc.rights: University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document. For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
dc.subject: Deep Neural Network
dc.subject: Testing DNN models
dc.subject: Multi-implementation testing
dc.subject: Guided Mutation
dc.subject: Test case generation
dc.subject.classification: Engineering
dc.title: Automated Software Testing of Deep Neural Network Programs
dc.type: master thesis
thesis.degree.discipline: Engineering – Electrical & Computer
thesis.degree.grantor: University of Calgary
thesis.degree.name: Master of Science (MSc)
ucalgary.item.requestcopy: true
Files

Original bundle (showing 1 of 1)
Name: ucalgary_2020_vahdatpour_maryam.pdf
Size: 5.23 MB
Format: Adobe Portable Document Format
Description: Main article

License bundle (showing 1 of 1)
Name: license.txt
Size: 2.62 KB
Format: Item-specific license agreed upon to submission