Speaker(s):

Uso de fuzzy hashing y aprendizaje profundo para detección de malware

Presentado en Data Day 2022

Today’s cybersecurity threats continue to find ways to fly and stay under the radar. Cybercriminals use polymorphic malware because a slight change in the binary code or script could allow the said threats to avoid detection by traditional antivirus software. Threat actors customize their wares specific to their target organizations to increase their chances of breaking into and moving laterally through an entire corporate network, exfiltrating data, and leaving with little or no trace. The underground economy is rife with malware builders, Trojanized versions of legitimate applications, and other tools and services that allow malware operators to deploy highly evasive malware. I propose a new approach that combines deep learning with fuzzy hashing. This approach utilizes fuzzy hashes as input to identify similarities among files and to determine if a sample is malicious or not. Then, a deep learning methodology inspired by natural language processing (NLP) better identifies similarities that actually matter, thus improving detection quality and scale of deployment.