Naman Bagga | On the Effectiveness of Generic Malware Models | ||
Armen Boursalian | Bootbandit: A macOS Bootloader Attack | ||
Michael Crawford | Metamorphic Code Generation Using LLVM | ||
Dhiviya Dhanasekar | Detecting Encrypted Malware Using Hidden Markov Models | ||
Sravani Yajamanam | Deep Learning for Image-Based Malware Classification |
Malware detection based on machine learning
techniques is often treated as a problem
specific to a particular malware family. In such cases, detection
involves training and testing a model for each malware family.
This approach can generally achieve high accuracy, but in a realistic setting
we require a classification step for each possible family, which is likely to
be slow, inefficient, and impractical.
In contrast, classifying samples as malware or benign based on a single
model would be far more efficient. However, such an approach is
extremely challenging, because extracting common features from
a variety of malware families might result in a model that is too generic to be useful.
In this research, we perform controlled experiments to determine the tradeoff
between accuracy and the number of malware families modeled.
Full disk encryption (FDE) is used to protect a computer system against
data theft by physical access. If a laptop or hard disk drive protected with
FDE is stolen or lost, the data remains unreadable without the encryption key.
To foil this defense, an intruder can gain physical access to a computer system
in a so-called "evil maid" attack, install malware in the boot (pre-operating system)
environment, and use the malware to intercept the victim's password.
Such an attack relies on the fact that the system is in a vulnerable state
before booting into the operating system. In this paper, we discuss an evil maid
type of attack, in which the victim's password is stolen in the boot environment,
passed to the macOS user environment, and then exfiltrated from the system to
the attacker's remote command and control server. On a macOS system, this attack
has additional implications due to "password forwarding" technology,
whereby a user's account password also serves as the FDE password.
Each instance of metamorphic software changes its internal structure,
but the function remains essentially the same. Such metamorphism has been
used primarily by malware writers as a means of evading signature-based
detection. However, metamorphism also has potential beneficial uses in fields
related to software protection. In this research, we develop a practical
framework within the LLVM compiler that automatically generates metamorphic code,
where the user has well-defined control over the degree of morphing applied
to the code. We analyze the effectiveness of this metamorphic generator
based on hidden Markov model analysis.
Encrypted code is common in some types of advanced malware, while
such code virtually never appears in legitimate applications. Hence,
the presence of encrypted code within an executable file
could serve as a strong heuristic for malware detection.
In this research, we consider the feasibility of detecting
encrypted code using hidden Markov models. Our results show
that such a technique can be effective and practical.
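As a rough illustration of how a hidden Markov model can score a sequence, the sketch below implements the scaled forward algorithm and returns a log-likelihood per symbol. It is not the method used in the research above: the abstract does not say which observations are extracted from an executable or how the model is trained, so the function name `log_likelihood_per_symbol`, the two-state toy model, and the symbol sequences are all hypothetical and chosen by hand.

```python
import math


def log_likelihood_per_symbol(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | model) divided by len(obs).

    obs: list of integer symbols
    pi:  initial state probabilities, pi[i]
    A:   state transition probabilities, A[i][j]
    B:   emission probabilities, B[i][symbol]
    """
    n_states = len(pi)
    # alpha_0(i) = pi[i] * B[i][obs[0]]
    alpha = [pi[i] * B[i][obs[0]] for i in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # alpha_t(j) = (sum_i alpha_{t-1}(i) * A[i][j]) * B[j][obs[t]]
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n_states)) * B[j][obs[t]]
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            # The model assigns zero probability to this sequence.
            return float("-inf")
        # Rescale to avoid underflow; the log of the scale factors sums
        # to log P(obs | model).
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_prob / len(obs)


# Hypothetical toy model (assumption, not from the source): two "sticky"
# states, each strongly preferring one of two symbols.
toy_pi = [0.5, 0.5]
toy_A = [[0.9, 0.1], [0.1, 0.9]]
toy_B = [[0.9, 0.1], [0.1, 0.9]]

runs = [0, 0, 0, 0, 1, 1, 1, 1]          # matches the model's structure
alternating = [0, 1, 0, 1, 0, 1, 0, 1]   # does not match it

score_runs = log_likelihood_per_symbol(runs, toy_pi, toy_A, toy_B)
score_alt = log_likelihood_per_symbol(alternating, toy_pi, toy_A, toy_B)
```

With these hand-picked parameters, the sequence with long runs scores higher than the alternating one. In a detection setting, a score of this kind would presumably be compared against a threshold, but the choice of features, training data, and threshold is outside what the abstract states.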
Image features known as "gist descriptors" have recently been applied
to the malware classification problem. In this research, we implement,
test, and analyze a malware score based on gist descriptors
and verify that the resulting score yields strong classification results.
We also analyze the robustness of this gist-based scoring technique when
applied to obfuscated malware, and we perform feature reduction to
determine a minimal set of gist features. Then we compare the effectiveness
of a deep learning technique to this gist-based approach. While scoring based
on gist descriptors is effective, we show that our deep learning approach
performs equally well. For this problem, a potential advantage of deep learning
is that there is no need to extract the gist features when training or scoring