Explainable malware detection using self-attention

Author : Gabriel Kožuch
Contact : kozuch8@uniba.sk
Supervisor : Mgr. Iveta Bečková, PhD.
Consultant : Mgr. Štefan Pócoš, PhD.
GitHub repository : explainable-attention
Working version of the thesis : diplomovka

Assignment

The goal of this thesis is to use a transformer-like architecture with self-attention for the task of malware detection, with a focus on comparing multiple data representations and their effect on final accuracy and explainability.

Annotation

Explainability is sought after in many machine learning domains. Malware detection is no exception: an explanation provides extra information about why a certain sample is (or is not) classified as malware. One important factor affecting the success of machine learning methods is the quality of the input data, with a preference for compact feature representations that capture as much relevant information as possible. In this respect, a major advantage of transformer-like architectures is that they can process inputs of varying sizes. Moreover, they use self-attention during inference, which can be visualized, providing a certain level of inherent explainability.
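
To make the two properties mentioned above concrete, the following is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is an illustration only, not the model used in the thesis: the dimensions, the random projection matrices and the names sample_a, sample_b and random_matrix are assumptions made up for this example. The same weights are applied to inputs with different numbers of feature tokens, and the function returns the attention matrix alongside the output so that it can be inspected or plotted.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j. Each row of weights sums to 1.
    """
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of K
    scores = matmul(q, k_t)               # one score per token pair
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical helper: a matrix of uniform random values."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_model, d_k = 4, 3  # assumed toy dimensions
w_q, w_k, w_v = (random_matrix(d_model, d_k, rng) for _ in range(3))

# Two hypothetical samples with different numbers of feature tokens.
sample_a = random_matrix(2, d_model, rng)
sample_b = random_matrix(5, d_model, rng)

out_a, attn_a = self_attention(sample_a, w_q, w_k, w_v)
out_b, attn_b = self_attention(sample_b, w_q, w_k, w_v)

# The attention matrix grows with the input: 2x2 for sample_a, 5x5 for sample_b.
print(len(attn_a), len(attn_b))
```

In a trained model, a row of the attention matrix could be rendered as a heatmap over the input features; how faithfully such weights explain a prediction is a question the sketch does not address.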

Literature

A. Vaswani et al., Attention is all you need, in: Advances in Neural Information Processing Systems 30 (2017).
H. S. Anderson, P. Roth, EMBER: an open dataset for training static PE malware machine learning models, 2018. arXiv:1804.04637.
P. Švec, Š. Balogh, M. Homola, J. Kľuka, T. Bisták, Semantic data representation for explainable Windows malware detection models, arXiv preprint arXiv:2403.11669 (2024).
J. Lee et al., Set Transformer: A framework for attention-based permutation-invariant neural networks, 2019. arXiv:1810.00825.
X. Huang et al., TabTransformer: Tabular data modeling using contextual embeddings, 2020. arXiv:2012.06678.
