— We present a hybrid data mining approach to detect malicious executables. In this approach we identify important features of the malicious and benign executables. These features are used by a classifier to learn a classification model that can distinguish between malicious and benign executables. We construct a novel combination of three different kinds of features: binary n-grams, assembly n-grams, and library function calls. Binary features are extracted from the binary executables, whereas assembly features are extracted from the disassembled executables. The function call features are extracted from the program headers. We also propose an efficient and scalable feature extraction technique. We apply our model on a large corpus of real benign and malicious executables. We extract the abovementioned features from the data and train a classifier using Support Vector Machine. This classifier achieves a very high accuracy and low false positive rate in detecting malicious executable...
Mohammad M. Masud, Latifur Khan, Bhavani M. Thurai