RESEARCH PRODUCT

Support vector machine integrated with game-theoretic approach and genetic algorithm for the detection and classification of malware

Timo Hämäläinen, Mikhail Zolotukhin

subject

ta113; Network security; business.industry; Computer science; Feature vector; Feature extraction; uhat (threats); Byte; computer.file_format; Machine learning; computer.software_genre; haittaohjelmat (malware); Support vector machine; Obfuscation (software); ComputingMethodologies_PATTERNRECOGNITION; network; network security; Malware; Data mining; Artificial intelligence; Executable; tietoturva (information security); business; computer

description

Abstract: In the modern world, the rapid growth of malicious software production has become one of the most significant threats to network security. Unfortunately, widespread signature-based anti-malware strategies can neither detect previously unseen malware nor cope with the code obfuscation techniques employed by malware designers. In our study, the problem of malware detection and classification is solved by applying a data-mining-based approach that relies on supervised machine learning. Executable files are represented as byte and opcode sequences, and n-gram models are employed to extract essential features from these sequences. The resulting feature vectors are classified by support vector classifiers integrated with a genetic algorithm that selects the most essential features, and a game-theoretic approach is applied to combine the classifiers. The proposed algorithm, ZSGSVM, is tested on byte and opcode sequences obtained from a set of executable files containing both benign software and malware. As a result, almost all malicious files are detected while the number of false alarms remains very low.

peerReviewed
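The n-gram feature extraction step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the n-gram length, the normalization, or how the vocabulary is chosen, so the value n = 2, the relative-frequency normalization, the toy byte string, and the names `ngram_counts` and `feature_vector` are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(data: bytes, n: int = 2) -> Counter:
    """Count every overlapping n-gram of bytes in `data` (n=2 is an assumption)."""
    return Counter(data[i:i + n] for i in range(len(data) - n + 1))


def feature_vector(data: bytes, vocabulary: list, n: int = 2) -> list:
    """Relative frequency of each vocabulary n-gram in `data`.

    The vocabulary is assumed to be fixed in advance, for example
    collected from a training set; the abstract does not say how.
    """
    counts = ngram_counts(data, n)
    total = sum(counts.values())
    if total == 0:
        # Input shorter than n bytes: no n-grams, so an all-zero vector.
        return [0.0] * len(vocabulary)
    return [counts[gram] / total for gram in vocabulary]


# Hypothetical toy input, not taken from the paper's data set.
sample = bytes.fromhex("4d5a90004d5a")
vocab = [b"\x4d\x5a", b"\x90\x00", b"\xff\xff"]
vec = feature_vector(sample, vocab)
```

Vectors of this kind could then be passed to a support vector classifier. The same counting scheme would apply to opcode sequences if each opcode were treated as one token instead of one byte.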

https://doi.org/10.1109/glocomw.2013.6824988