Gauri Anil Godghase | Distinguishing Chatbot from Human
Atharva Khadilkar | Malware Detection Using QR and Aztec Code Representations
Rohit Mapakshi | An Empirical Analysis of Adversarial Attacks in Federated Learning
Sarvagya Bhargava | Robustness of Learning Models to Label Flipping Attacks
There have been many recent advances in the field of Generative Artificial Intelligence
and Large Language Models (LLMs), with OpenAI's ChatGPT being one of the frontrunners.
These LLMs have become so powerful that they can produce human-like text. In this research,
we consider the problem of distinguishing chatbot-generated text from human-generated text
using machine learning. We collect a large dataset of human-generated paragraphs,
and we use ChatGPT to generate a comparable collection of paragraphs. We then train a wide
range of machine learning models on various features extracted from this data.
We find that it is surprisingly easy to distinguish chatbot-generated text from
human-generated text, with our best models yielding accuracies of about 99%. We analyze
our models and features so as to better understand why chatbot text stands out
from human text.
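As a minimal illustration of this kind of classifier, the hypothetical sketch below trains logistic regression on TF-IDF character n-gram features to separate chatbot-style from human-style text. The toy sentences, feature settings, and model choice are illustrative assumptions, not the paper's dataset or best-performing model.

```python
# Hypothetical sketch: chatbot- vs human-text classification with
# TF-IDF features and logistic regression (toy data, not the paper's).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

human_texts = [
    "honestly the movie dragged a bit but the ending got me",
    "we drove out past the old mill, rain the whole way there",
]
bot_texts = [
    "Certainly! Here is a concise overview of the topic you requested.",
    "In conclusion, it is important to consider multiple perspectives.",
]

texts = human_texts + bot_texts
labels = [0, 0, 1, 1]  # 0 = human, 1 = chatbot

# Character n-grams are a common stylometric feature choice.
vectorizer = TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 4))
X = vectorizer.fit_transform(texts)

clf = LogisticRegression(max_iter=1000)
clf.fit(X, labels)

# Score a new snippet (illustrative only).
probe = vectorizer.transform(["Certainly! Here is a summary."])
print(clf.predict_proba(probe)[0, 1])  # estimated P(chatbot)
```

In practice, a much larger corpus and richer feature sets (as in the study) are needed for reliable results.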
In recent years, the use of image-based techniques for malware detection has
gained prominence, with numerous studies demonstrating the efficacy of deep learning
approaches such as convolutional neural networks (CNNs) in classifying images derived
from executable files. In this paper, we consider an innovative image conversion method
that transforms executable files into QR and Aztec codes. These codes capture structural
patterns in a format that may enhance the
learning capabilities of CNNs. We design and implement CNN architectures tailored to
the unique properties of these codes and apply them in a comprehensive analysis involving
two extensive malware datasets, alongside a significant corpus of benign executables.
Our results, which surpass those of comparable studies, suggest that the choice of
image conversion strategy is crucial, and that using QR and Aztec codes offers a
promising direction for future research in malware detection.
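The core step of such a pipeline is mapping raw executable bytes to a 2D code image that a CNN can consume. As a simplified stand-in (real QR and Aztec encoders add finder patterns, masking, and error correction on top of the payload), the sketch below unpacks a byte chunk into a square binary grid; the chunk size and grid dimensions are illustrative assumptions.

```python
import numpy as np

def bytes_to_binary_grid(data: bytes, side: int = 64) -> np.ndarray:
    """Map raw bytes to a side x side binary image (0/1 pixels).

    Simplified stand-in for a QR/Aztec encoding step; real code formats
    also add finder patterns, masking, and error correction.
    """
    needed = side * side // 8                       # bytes to fill the grid
    payload = data[:needed].ljust(needed, b"\x00")  # truncate or zero-pad
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    return bits.reshape(side, side)

# Illustrative use on placeholder bytes (not a real executable).
grid = bytes_to_binary_grid(b"MZ\x90\x00" * 300)
print(grid.shape)  # (64, 64)
```

The resulting grid can be fed directly to a CNN as a single-channel image; the paper's actual pipeline uses true QR and Aztec encodings rather than this raw bit layout.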
In this research, we experimentally analyze the susceptibility of selected
Federated Learning (FL) systems to the presence of adversarial clients.
We find that temporal attacks significantly affect model performance in FL,
especially when the adversaries are active throughout the process, and in particular
during its final rounds. We consider a wide variety of machine learning models,
including Multinomial Logistic Regression, Support Vector Classifier (SVC),
neural network models such as Multilayer Perceptron (MLP),
Convolutional Neural Network (CNN), Recurrent Neural Network (RNN),
and Long Short-Term Memory (LSTM), as well as tree-based machine learning models,
specifically Random Forest and XGBoost. Our results highlight the effectiveness
of temporal attacks and the need to develop strategies to make the FL process
more robust. We also explore defense mechanisms, including outlier detection
in the aggregation algorithm.
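One simple form of the outlier-detection defense mentioned above is to filter client updates whose norm deviates sharply from the median before averaging. The sketch below is a hedged illustration of that idea, not the paper's exact algorithm; the threshold, update shapes, and synthetic adversarial update are all assumptions.

```python
# Hypothetical sketch: federated averaging with a median-absolute-deviation
# (MAD) filter that discards client updates with anomalous norms.
import numpy as np

def robust_fedavg(updates, z=2.5):
    norms = np.array([np.linalg.norm(u) for u in updates])
    med = np.median(norms)
    mad = np.median(np.abs(norms - med)) + 1e-12   # robust spread estimate
    keep = np.abs(norms - med) / mad <= z          # flag extreme norms
    kept = [u for u, k in zip(updates, keep) if k]
    return np.mean(kept, axis=0)

rng = np.random.default_rng(0)
benign = [rng.normal(0, 0.1, size=10) for _ in range(9)]
malicious = [np.full(10, 100.0)]                   # poisoned client update
agg = robust_fedavg(benign + malicious)
print(np.linalg.norm(agg))  # small: the poisoned update is filtered out
```

Coordinate-wise medians or trimmed means are common alternatives to this norm-based filter.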
In this research, we compare traditional machine learning and deep learning models trained on a malware dataset when subjected to adversarial label flipping attacks. We investigate the robustness of each model when faced with misleading labels, assessing its ability to maintain accuracy under such adversarial manipulation. We find that the models differ substantially in their robustness to label flipping attacks.
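The basic experimental setup can be sketched as follows: flip a fraction of the training labels and compare test accuracy against a clean baseline. The synthetic dataset, flip rate, and choice of logistic regression are illustrative assumptions, not the study's malware data or model suite.

```python
# Hypothetical sketch of a label flipping attack on a binary classifier.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def flip_labels(y, rate, rng):
    """Flip `rate` of the binary labels at randomly chosen indices."""
    y = y.copy()
    idx = rng.choice(len(y), size=int(rate * len(y)), replace=False)
    y[idx] = 1 - y[idx]  # binary labels: 0 <-> 1
    return y

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
clean_acc = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).score(X_te, y_te)
flipped_acc = LogisticRegression(max_iter=1000).fit(
    X_tr, flip_labels(y_tr, rate=0.4, rng=rng)).score(X_te, y_te)
print(clean_acc, flipped_acc)
```

Repeating this across flip rates and model families yields the kind of robustness comparison the study reports.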