Lolitha Sresta Tupadha | Machine learning to detect malware evolution
Han-Chih Chang | Keystroke dynamics based on machine learning
Jianwei Li | Keystroke dynamics for user authentication with fixed and free text
Rakesh Nagaraju | Malware analysis with auxiliary-classifier GAN
Xinxin Yang | Computer-aided diagnosis of low grade endometrial stromal sarcoma (LGESS)
Ruchira Gothankar | Clickbait detection in YouTube videos
Shamli Singh | Hidden Markov model-based clustering for malware classification
Aditi Walia | Data augmentation of malware images
Deanne Charan | Classic cryptanalysis with GANs
Malware evolves over time and antivirus must adapt to such evolution.
Hence, it is critical to detect those points in time where malware has evolved
so that appropriate countermeasures can be undertaken. In this research,
we perform a variety of experiments on a significant number of malware families
to determine when malware evolution
is likely to have occurred. All of the evolution detection techniques that
we consider are based on machine learning and can be fully automated—in
particular, no reverse engineering or other labor-intensive manual analysis
is required. Specifically, we consider analysis based on hidden Markov models
and various word embedding techniques, including Word2Vec and HMM2Vec.
The development of active and passive biometric authentication
and identification
technology plays an increasingly important role in cybersecurity.
Biometrics that utilize
features derived from keystroke dynamics have been studied in this context.
Keystroke dynamics can be
used to analyze the way that a user types by monitoring various keyboard inputs.
Previous work has considered the feasibility of user authentication and classification
based on keystroke dynamics. In this research, we analyze a wide variety of machine
learning and deep learning models based on keystroke-derived features,
we optimize the resulting models, and we compare our results to those
obtained in related research. We find that a model that combines a
convolutional neural network (CNN) and a gated recurrent unit (GRU)
performs best in our experiments. This model also outperforms previous
research in this field.
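As a rough illustration of what keystroke-derived features can look like, the sketch below computes two commonly used timing quantities: hold time (how long a key stays pressed) and flight time (the gap between releasing one key and pressing the next). The event format, the function name `timing_features`, and the sample values are assumptions made for this example; the abstract does not specify which features were used.

```python
from typing import List, Tuple

# Hypothetical event format: (key, press_time, release_time), times in seconds.
KeyEvent = Tuple[str, float, float]


def timing_features(events: List[KeyEvent]) -> Tuple[List[float], List[float]]:
    """Return (hold_times, flight_times) for an ordered list of key events."""
    # Hold time: release minus press for each individual key.
    hold = [release - press for _, press, release in events]
    # Flight time: press of the next key minus release of the current key.
    flight = [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]
    return hold, flight


# Made-up sample: the user types "pas".
sample = [("p", 0.00, 0.25), ("a", 0.50, 0.75), ("s", 1.00, 1.50)]
hold_times, flight_times = timing_features(sample)
print(hold_times)    # [0.25, 0.25, 0.5]
print(flight_times)  # [0.25, 0.25]
```

Vectors like these could then be fed to a classifier; the choice of model is independent of this feature extraction step.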
In this research, we consider the problem of verifying user identity based on
keystroke dynamics obtained from fixed text or free text.
For fixed-text typing behavior, multiple machine learning and deep learning
methods are employed, with XGBoost and Multi-Layer Perceptron (MLP)
performing best. For free-text typing, we employ a novel feature engineering
method that generates image-like transition matrices. For these image-like features,
a convolutional neural network (CNN) with cutout achieves results
that are competitive with previous work. A hybrid model
consisting of a CNN and a recurrent neural network (RNN) is
shown to outperform previous research for free-text typing.
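One plausible way to build an image-like transition matrix from free text is sketched below: each cell holds the mean press-to-press latency for a (previous key, next key) pair. This construction, the function name `transition_matrix`, and the toy input are assumptions for illustration only and are not necessarily the feature engineering method used in this research.

```python
from typing import List


def transition_matrix(keys: str, press_times: List[float],
                      alphabet: str) -> List[List[float]]:
    """Mean press-to-press latency for each (previous key, next key) pair.

    Rows index the previous key and columns index the next key, both in
    the order given by `alphabet`. Pairs that never occur are left at 0.0.
    """
    index = {ch: i for i, ch in enumerate(alphabet)}
    n = len(alphabet)
    total = [[0.0] * n for _ in range(n)]
    count = [[0] * n for _ in range(n)]
    for i in range(len(keys) - 1):
        a, b = keys[i], keys[i + 1]
        if a in index and b in index:
            total[index[a]][index[b]] += press_times[i + 1] - press_times[i]
            count[index[a]][index[b]] += 1
    return [
        [total[r][c] / count[r][c] if count[r][c] else 0.0 for c in range(n)]
        for r in range(n)
    ]


# Toy input: the sequence "abab" with made-up press times in seconds.
matrix = transition_matrix("abab", [0.0, 0.5, 1.0, 2.0], alphabet="ab")
print(matrix)  # [[0.0, 0.75], [0.5, 0.0]]
```

A matrix of this shape can be treated as a single-channel image, which is what makes CNN-style models applicable.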
Generative adversarial networks (GANs) are a class of powerful
machine learning techniques
in which a generative model and a discriminative model are trained simultaneously.
A recent trend in malware research consists of treating executables as images
and employing image-based analysis techniques. In this research, we
generate fake malware images using GANs, and we consider the effectiveness
of GANs for malware classification. Specifically, we consider auxiliary classifier
GAN (AC-GAN), which enables us to work with multiclass data.
We find that AC-GAN generates "deep fake" images, in the sense
that our GAN-generated malware images cannot be reliably distinguished
from real malware images, based on convolutional neural networks (CNN).
In addition, the detection capabilities of AC-GAN are shown to
exceed those of other image-based techniques that have appeared in the literature.
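Treating an executable as an image usually amounts to reading its raw bytes as grayscale pixel values. The sketch below shows one simple version of that idea; the function name `bytes_to_image`, the fixed width, and the zero padding of the last row are assumptions for this example rather than details taken from the research.

```python
from typing import List


def bytes_to_image(data: bytes, width: int) -> List[List[int]]:
    """Reshape raw bytes into rows of `width` grayscale pixels (0 to 255).

    The final row is padded with zero bytes if the data length is not a
    multiple of `width`.
    """
    padded = data + bytes((-len(data)) % width)
    return [list(padded[i:i + width]) for i in range(0, len(padded), width)]


# Five made-up bytes standing in for the contents of an executable.
image = bytes_to_image(bytes([0, 64, 128, 255, 7]), width=2)
print(image)  # [[0, 64], [128, 255], [7, 0]]
```

In practice the bytes would come from a file opened in binary mode, and the resulting rows would be resized to whatever input shape the image model expects.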
Low grade endometrial stromal sarcoma (LGESS) is a rare form of cancer,
accounting for about 0.2% of all uterine cancer cases.
Approximately 75% of LGESS
patients are initially misdiagnosed with leiomyoma, which is a type of benign
tumor that is also known as fibroids.
In this research, uterine tissue biopsy images of potential LGESS patients
are preprocessed using segmentation and staining normalization algorithms.
A variety of classic machine learning and leading deep learning models
are then applied to classify tissue images as either benign or cancerous.
For the classic techniques considered, the highest classification accuracy
we attain is 85%, while our best deep learning model achieves an accuracy of 87%.
These results clearly indicate that properly trained learning algorithms
can play a useful role in the diagnosis of LGESS.
YouTube videos often include
captivating descriptions and intriguing thumbnails designed to increase the number
of views, and thereby increase the revenue for the person who posted the video.
This creates an incentive for people to post clickbait videos, in which the content
might deviate significantly from the title, description, or thumbnail. In effect,
users are tricked into clicking on clickbait videos.
In this research, we consider the challenging problem of detecting clickbait
YouTube videos. We experiment with multiple state-of-the-art machine learning
techniques using a variety of textual features.
Automated techniques to classify malware samples into their respective families
are of critical importance in cybersecurity. Previous research applied
k-means clustering to scores generated by hidden Markov models (HMM)
as a means of dealing with the
malware classification problem. In this research, we follow an analogous
approach, but instead of using HMMs to generate scores, we directly cluster
trained HMMs. We carefully analyze the results obtained over a large
and challenging malware dataset.
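For context, the score-based approach from the previous research can be sketched as plain k-means over vectors of HMM scores. The code below is a minimal illustration of that baseline, not of the direct HMM clustering studied here; the function name `kmeans`, the naive initialization, and the score values are all assumptions for this example.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def kmeans(points: List[Point], k: int,
           iterations: int = 20) -> Tuple[List[int], List[List[float]]]:
    """Basic k-means with squared Euclidean distance.

    Centroids are initialized to the first k points, a naive choice that
    keeps the sketch deterministic.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2
                                  for x, y in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids


# Hypothetical samples, each scored by two HMMs (log-likelihood per symbol).
scores = [(-1.0, -9.0), (-9.0, -1.0), (-1.5, -8.5), (-8.5, -1.5)]
labels, centroids = kmeans(scores, k=2)
print(labels)     # [0, 1, 0, 1]
print(centroids)  # [[-1.25, -8.75], [-8.75, -1.25]]
```

Samples that score well under the same model end up in the same cluster, which is the intuition behind using HMM scores as clustering features.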
Machine learning and deep learning
techniques for malware detection and classification play an important
role in the mitigation of cybersecurity threats. However,
such techniques are often limited
by a lack of training data. Previous research has shown promising
classification results
by treating malware executables as images. In this research, we consider
auxiliary classifier GANs (AC-GAN) for data
augmentation of malware images. We train convolutional neural
networks (CNNs) to determine how accurately our generated images
model the original malware samples. We also consider adversarial
scenarios, where our augmented data can be used to corrupt the
CNN training process.
The necessity of protecting critical information has been
understood for millennia. Although classic ciphers have inherent weaknesses
in comparison to modern ciphers, many classic ciphers are extremely challenging
to break in practice. Machine learning techniques, such as hidden
Markov models (HMM), have recently been applied with success to various
classic cryptanalysis problems. In this research, we
consider the effectiveness of the deep learning technique
CipherGAN—which is based on the well-established generative
adversarial network (GAN) architecture—for classic
cipher cryptanalysis. We experiment extensively with CipherGAN
on a number of classic ciphers, and we compare our results
to those obtained using HMMs.
https://sjsu.zoom.us/j/89264948001?pwd=Tjg0Rk40SVNOVFgyZ010SkpHVlAwQT09
Password (encrypted with a Caesar's cipher): 875485
https://sjsu.zoom.us/j/84774068125?pwd=Wi9uOUZRL2ZtUU84RzRiUFpvK0JvZz09
Password (encrypted with a Caesar's cipher): 083283
https://sjsu.zoom.us/j/81883032995?pwd=U1o0SnQzUUIzb1MrZmIrb2QraGtVZz09
Password (encrypted with a Caesar's cipher): 887446
https://sjsu.zoom.us/j/82472298802?pwd=bTRzcVZkMjBUd0hsUi8wSzFUOHpaZz09
Password (encrypted with a Caesar's cipher): 081368
https://sjsu.zoom.us/j/85927634979?pwd=Rks0VElpOWFyakdaV09zRTMwblNIZz09
Password (encrypted with a Caesar's cipher): 522389
https://sjsu.zoom.us/j/85775665384?pwd=RERFVVdHb09qdnA2Vkdvd0FUN0NKQT09
Password (encrypted with a Caesar's cipher): 570953
https://sjsu.zoom.us/j/88234305389?pwd=NWE5YzN3WkFtNzUzQk1XUWxNVXIwUT09
Password (encrypted with a Caesar's cipher): 247448
https://sjsu.zoom.us/j/85053362127?pwd=Mlhob2piNHFKeDJQd2pXWXU0RnIwZz09
Password (encrypted with a Caesar's cipher): 809697
https://sjsu.zoom.us/j/89452422530?pwd=VEx5dUxXSVRmWFduc0dnclg2d1A4UT09
Password (encrypted with a Caesar's cipher): 618255
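A Caesar cipher over digits can be read as shifting each digit by a fixed amount modulo 10. The sketch below assumes that interpretation; the shift value is not given in this listing, so the example uses a made-up string and simply enumerates all ten possible shifts. The function name `shift_digits` is introduced here for illustration.

```python
DIGITS = "0123456789"


def shift_digits(text: str, shift: int) -> str:
    """Shift every ASCII digit in `text` by `shift` modulo 10.

    Non-digit characters are passed through unchanged.
    """
    return "".join(
        str((int(ch) + shift) % 10) if ch in DIGITS else ch for ch in text
    )


# Made-up example: encrypt with a shift of 3, then list every candidate
# decryption, since an unknown shift leaves only ten possibilities.
ciphertext = shift_digits("012789", 3)
candidates = [shift_digits(ciphertext, -s) for s in range(10)]
print(ciphertext)     # 345012
print(candidates[3])  # 012789
```

With only ten candidate shifts, exhaustive search is trivial, which is the kind of inherent weakness of classic ciphers mentioned in the final abstract.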