Stamp's Students' Defenses: Fall 2017

Who	When	Where	Title
Naman Bagga	December 12 @ noon	MH 225	On the Effectiveness of Generic Malware Models
Armen Boursalian	December 12 @ 1:00pm	MH 225	Bootbandit: A macOS Bootloader Attack
Michael Crawford	December 12 @ 2:00pm	MH 225	Metamorphic Code Generation Using LLVM
Dhiviya Dhanasekar	December 4 @ 10:30am	DH 450	Detecting Encrypted Malware Using Hidden Markov Models
Sravani Yajamanam	December 7 @ noon	MH 233	Deep Learning for Image-Based Malware Classification

On the Effectiveness of Generic Malware Models

by Naman Bagga

Malware detection based on machine learning techniques is often treated as a problem specific to a particular malware family. In such cases, detection involves training and testing a model for each malware family. This approach can generally achieve high accuracy, but in a realistic setting we require a classification step for each possible family, which is likely to be slow, inefficient, and impractical. In contrast, classifying samples as malware or benign based on a single model would be far more efficient. However, such an approach is extremely challenging—extracting common features from a variety of malware families might result in a model that is too generic to be useful. In this research, we perform controlled experiments to determine the tradeoff between accuracy and the number of malware families modeled.

Bootbandit: A macOS Bootloader Attack

by Armen Boursalian

Full disk encryption (FDE) is used to protect a computer system against data theft by physical access. If a laptop or hard disk drive protected with FDE is stolen or lost, the data remains unreadable without the encryption key. To foil this defense, an intruder can gain physical access to a computer system in a so-called "evil maid" attack, install malware in the boot (pre-operating system) environment, and use the malware to intercept the victim's password. Such an attack relies on the fact that the system is in a vulnerable state before booting into the operating system. In this paper, we discuss an evil maid type of attack, in which the victim's password is stolen in the boot environment, passed to the macOS user environment, and then exfiltrated from the system to the attacker's remote command and control server. On a macOS system, this attack has additional implications due to "password forwarding" technology, whereby a user's account password also serves as the FDE password.

Metamorphic Code Generation Using LLVM

by Michael Crawford

Each instance of metamorphic software changes its internal structure, but the function remains essentially the same. Such metamorphism has been used primarily by malware writers as a means of evading signature-based detection. However, metamorphism also has potential beneficial uses in fields related to software protection. In this research, we develop a practical framework within the LLVM compiler that automatically generates metamorphic code, where the user has well-defined control over the degree of morphing applied to the code. We analyze the effectiveness of this metamorphic generator based on hidden Markov model analysis.

Detecting Encrypted Malware Using Hidden Markov Models

by Dhiviya Dhanasekar

Encrypted code is present in many types of advanced malware, while such code virtually never appears in legitimate applications. Hence, the presence of encrypted code within an executable file could serve as a strong heuristic for malware detection. In this research, we consider the feasibility of detecting encrypted code using hidden Markov models. Our results show that such a technique can be effective and practical.

Deep Learning for Image-Based Malware Classification

by Sravani Yajamanam

Image features known as "gist descriptors" have recently been applied to the malware classification problem. In this research, we implement, test, and analyze a malware score based on gist descriptors and verify that the resulting score yields strong classification results. We also analyze the robustness of this gist-based scoring technique when applied to obfuscated malware, and we perform feature reduction to determine a minimal set of gist features. Then we compare the effectiveness of a deep learning technique to this gist-based approach. While scoring based on gist descriptors is effective, we show that our deep learning approach performs equally well. For this problem, a potential advantage of deep learning is that there is no need to extract the gist features when training or scoring.