Intrusion Detection Using Heterogeneous Data Sources

Date
2024-01-18
Abstract
As cyber-attacks and malware grow more sophisticated, conventional Intrusion Detection Systems (IDS) often fall short, primarily because each relies on a single data source, as in Network-based (NIDS) or Host-based (HIDS) Intrusion Detection Systems. As the existing literature highlights, such systems tend to miss a comprehensive view of network activity. Recent research has attempted to integrate multiple heterogeneous data sources, yet it often treats each source in isolation and thereby overlooks the complex interrelations among data sources within the same network. This thesis introduces IMD-IDS, which stands apart in its ability to fuse multiple heterogeneous data sources effectively for anomaly detection. The centrepiece of IMD-IDS is a machine learning (ML) based detection engine trained concurrently on all available data sources, whether heterogeneous or not. This approach enables IMD-IDS to uncover the intricate relationships between different data sources. To achieve this, a novel fusion algorithm is presented that uses BERT encoders to convert textual host data into numerical vectors. These vectors are then integrated with feature vectors derived from network data, forming a single combined dataset. The XGBoost model employed within IMD-IDS is trained on this unified dataset, and its simultaneous access to diverse data sources improves anomaly detection accuracy. Through experimental validation, this thesis demonstrates that IMD-IDS outperforms previous multi-datasource IDS approaches in detecting both known and zero-day attacks. The results show an average performance improvement of 12% and 10%, respectively, for these attack types.
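The fusion step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm: the hashed bag-of-words function `embed_text` is a hypothetical stand-in for a BERT encoder (used so the example runs with the Python standard library alone), and the record fields, the `EMBED_DIM` value, and the pairing of host and network records by time window are all assumptions made for illustration. Only the overall shape follows the abstract: host text becomes a numerical vector, which is concatenated with the network feature vector.

```python
import hashlib
from typing import List, Sequence

# Hypothetical embedding size; BERT-base produces 768-dimensional vectors.
EMBED_DIM = 8


def embed_text(text: str, dim: int = EMBED_DIM) -> List[float]:
    """Stand-in for a BERT encoder (assumption): hashed bag-of-words, L2-normalised."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[digest[0] % dim] += 1.0  # bucket each token by its hash
    norm = sum(v * v for v in vec) ** 0.5
    return [v / norm for v in vec] if norm else vec


def fuse(host_log: str, network_features: Sequence[float]) -> List[float]:
    """Concatenate the host-text embedding with the network feature vector."""
    return embed_text(host_log) + [float(x) for x in network_features]


# Hypothetical records, assumed to be paired by time window.
host_logs = [
    "sshd failed password for root",
    "cron session opened for user backup",
]
network_rows = [
    [1500.0, 42.0, 0.9],  # illustrative values: bytes, packets, SYN ratio
    [320.0, 3.0, 0.1],
]

# Each fused row holds EMBED_DIM text dimensions followed by the network features.
fused = [fuse(log, row) for log, row in zip(host_logs, network_rows)]
```

Each row of `fused` would then serve as one training example for a single classifier (XGBoost, according to the abstract), so the model sees host and network evidence side by side rather than in separate per-source models.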
Keywords
Network Security, Intrusion Detection, Word Embedding, Applied Machine Learning, Multi Data Source
Citation
Abhari, B. (2024). Intrusion detection using heterogeneous data sources (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.