Content Based Spam Classification- A Deep Learning Approach

Date
2016
Abstract
In this thesis, we apply the Stacked Denoising Autoencoder (SDAE), a major type of deep learning network, to spam detection. We comprehensively compare its performance with that of two other prevalent deep learning techniques that can also be applied to spam filtering: the Deep Belief Network (DBN) and the Dense Multi-Layer Perceptron (Dense-MLP). We further compare these deep learning techniques against the state-of-the-art Support Vector Machine (SVM). Experiments were conducted on five benchmark corpora: PU1, PU2, PU3, PUA, and Enron-Spam. Accuracy, precision, recall, and F1 measure are the main criteria used in analysing and discussing the results. The experimental results verify the efficacy of deep learning approaches for real-world spam filtering. This project is part of larger research in deep learning being conducted by Wedge Networks, a leading technology vendor in security services. Wedge Networks works actively on securing data on the internet and in the cloud, including spam classification, web-page classification, and virus detection, among others. This study examines the application of deep learning to spam classification in particular and assesses its credibility against the state-of-the-art algorithm, SVM.
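The four evaluation criteria named in the abstract can be illustrated with a minimal sketch. It uses the standard textbook definitions and assumes spam is the positive class, encoded as 1; the function name `spam_metrics` and the label encoding are hypothetical and are not taken from the thesis, whose own evaluation code is not shown here.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def spam_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for a binary spam filter.

    Assumption: `positive` marks the spam class; any other label is legitimate mail.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # spam caught
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # legitimate mail flagged
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # spam missed
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # legitimate mail passed

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return Metrics(accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Toy labels for illustration only; these are not results from the thesis.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(spam_metrics(y_true, y_pred))
```

In spam filtering, precision and recall are usually read together: a false positive (a legitimate message flagged as spam) tends to cost the user more than a missed spam message, which is one reason accuracy alone is rarely reported.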
Keywords
Artificial Intelligence, Computer Science
Citation
Tyagi, A. (2016). Content Based Spam Classification- A Deep Learning Approach (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca. doi:10.11575/PRISM/25435