A Comparative Study of Machine Learning Algorithms for Real-time Detection of Windows Portable Executable (PE) Malware

Abstract:
Protection against zero-day and polymorphic malware, particularly malware targeting the Windows Portable Executable (PE) format, requires detection mechanisms that are both highly accurate and capable of real-time operation. Traditional approaches, including signature-based detection and computationally intensive dynamic analysis, struggle to meet the strict sub-second latency requirements of modern endpoint protection systems, which limits their effectiveness against evolving threats. This study evaluates the performance and practical applicability of three tree-based ensemble models (Random Forest, XGBoost, and LightGBM) for static malware detection in PE files. The proposed framework employs a zero-execution pipeline that extracts metadata, section entropy, and Import Address Table (IAT) configurations, and applies Information Gain (IG) and Principal Component Analysis (PCA) to reduce computational overhead. Experimental results on benchmark datasets show that, on high-dimensional tabular data, tree-based ensembles outperform both deep learning models, such as Multilayer Perceptrons (MLPs) and Convolutional Neural Networks (CNNs), and traditional machine learning approaches. While XGBoost achieves the highest classification accuracy (99.68%), LightGBM demonstrates superior overall operational performance: its leaf-wise tree growth and histogram-based optimization yield low-latency inference and reduced memory usage. These properties make it well suited to Endpoint Detection and Response (EDR) systems, where real-time performance is essential. The findings also highlight the importance of integrating Explainable AI (XAI) and advanced training strategies to improve robustness against increasingly sophisticated evasion techniques.
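
To make two of the static features mentioned in the abstract concrete, the following is a minimal sketch of how section entropy and Information Gain might be computed. It is not the authors' implementation: the function names `entropy` and `information_gain` are hypothetical, the section bytes are assumed to have already been read by a PE parser, and Information Gain is shown only for discrete feature values, since the abstract does not state how the study computes it.

```python
import math
from collections import Counter


def entropy(symbols) -> float:
    """Shannon entropy in bits of any sequence of hashable symbols.

    For raw section bytes this ranges from 0.0 (constant data) to 8.0
    (uniformly distributed bytes, typical of packed or encrypted sections).
    """
    total = len(symbols)
    if total == 0:
        return 0.0
    counts = Counter(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels) -> float:
    """IG = H(labels) - sum over v of p(v) * H(labels | feature == v).

    Assumes a discrete feature (hypothetical example: whether a given
    API name appears in the Import Address Table).
    """
    total = len(labels)
    if total == 0:
        return 0.0
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional
```

Under these assumptions, `entropy` would be applied to each section's raw bytes to produce one feature per section, and `information_gain` would rank candidate features so that only the most informative ones are passed on to the classifier.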

CITATION:

IEEE format

A. Sandro Cvetković, S. Adamović, M. Šarac, “A Comparative Study of Machine Learning Algorithms for Real-time Detection of Windows Portable Executable (PE) Malware,” in Sinteza 2025 - International Scientific Conference on Information Technology, Computer Science, and Data Science, Belgrade, Singidunum University, Serbia, 2026, pp. 144-150. doi:10.15308/Sinteza-2026-144-150

APA format

Sandro Cvetković, A., Adamović, S., Šarac, M. (2026). A Comparative Study of Machine Learning Algorithms for Real-time Detection of Windows Portable Executable (PE) Malware. Paper presented at Sinteza 2025 - International Scientific Conference on Information Technology, Computer Science, and Data Science. doi:10.15308/Sinteza-2026-144-150
