Far, Behrouz
Suman, Reeta
2021-03-23
2021-03-23
2021-03-19
Suman, R. (2021). An Approach to Server Log Analysis for Abnormal Behaviour Detection (Master's thesis, University of Calgary, Calgary, Canada). Retrieved from https://prism.ucalgary.ca.
http://hdl.handle.net/1880/113169
As server logs grow in size, the high volume of log data makes it difficult for human experts to manually examine error log messages and analyze anomalies. If an error message is rare or of low frequency, the system does not categorize it as important and it gets ignored, which may lead to fatal errors. Server log analytics has proven effective for active strategies, such as preventive maintenance or complete shut-down, and for good system performance. Data analysts need improved analytical strategies to handle large systems. For this analytical process to yield good results, the input data need to be of good quality; therefore, the research focuses on cleaning and pre-processing techniques. This research proposes consecutive logical steps to enhance the analysis of log messages. First, we propose extracting sequences and patterns from the logs by optimizing window sizes without losing valuable information, and combining them with forecasting techniques for predictive analytics. Second, we improve topic modelling for low-frequency messages through text analysis and language modelling. The resulting proof of concept does not merely visualize the log data; it provides insight into the logs through topics from the error messages. The experiments illustrate the effectiveness of the proposed steps and the approach for error log analysis.
eng
University of Calgary graduate students retain copyright ownership and moral rights for their thesis. You may use this material in any way that is permitted by the Copyright Act or through licensing that has been assigned to the document.
For uses that are not allowable under copyright legislation or licensing, you are required to seek permission.
LDA - Latent Dirichlet Allocation
LSTM - Long Short-Term Memory
RMSE - Root Mean Square Error
ARIMA - Auto-Regressive Integrated Moving Average
EAI - Enterprise Application Integration
PCA - Principal Component Analysis
EMS - Enterprise Management System
TF-IDF - Term Frequency-Inverse Document Frequency
Engineering
An Approach to Server Log Analysis for Abnormal Behaviour Detection
master thesis
10.11575/PRISM/38686