Leo Mei | Energy Considerations for Large Pre-trained Neural Networks

In recent years, neural networks have achieved phenomenal performance
due to the increasing complexity of model architectures. However,
complex models require massive computational resources and consume substantial
amounts of electricity, which highlights
the potential environmental impact of such models. Previous studies have
demonstrated that substantial redundancies exist in large pre-trained models.
However, this previous work has focused on retaining comparable model
performance, and the direct impact of compression on electricity consumption
appears to have received relatively little attention. By quantifying
the energy usage associated with both uncompressed and compressed models, we
investigate compression as a means of reducing electricity consumption.
We consider nine different pre-trained models,
ranging in size from 8M to 138M parameters. To establish a baseline,
we first train each model without compression and record the electricity usage
during training and the time required for inference.
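
As a concrete illustration of this baseline measurement, the sketch below wraps a toy PyTorch training loop in an energy tracker and times a forward pass. The codecarbon package and the stand-in model and data are our assumptions for illustration; the abstract does not specify the measurement tooling.

# Baseline measurement sketch (illustrative). The codecarbon package and
# the toy model/data below are assumptions, not details from the paper.
import time
import torch
import torch.nn as nn
from codecarbon import EmissionsTracker

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
data = torch.randn(512, 784)
labels = torch.randint(0, 10, (512,))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

tracker = EmissionsTracker()  # samples CPU/GPU power while running
tracker.start()
for epoch in range(5):  # stand-in training loop
    optimizer.zero_grad()
    loss = loss_fn(model(data), labels)
    loss.backward()
    optimizer.step()
emissions_kg = tracker.stop()  # also logs energy (kWh) to emissions.csv
print(f"estimated training emissions: {emissions_kg:.6f} kg CO2eq")

model.eval()
start = time.perf_counter()
with torch.no_grad():  # time a forward pass as the inference measurement
    _ = model(data)
print(f"inference time: {time.perf_counter() - start:.4f} s")
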
We then apply three compression techniques, namely,
steganographic capacity reduction, pruning, and low-rank factorization.
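
The sketch below gives one plausible PyTorch rendering of each technique, applied to a single linear layer. The sparsity level, rank, bit count, and in particular the reading of steganographic capacity reduction as low-order bit truncation are our assumptions for illustration, not settings from the paper.

# Illustrative sketches of the three compression techniques in PyTorch.
import copy
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

base = nn.Linear(512, 512)

# 1) Pruning: zero the 50% of weights with smallest L1 magnitude.
pruned = copy.deepcopy(base)
prune.l1_unstructured(pruned, name="weight", amount=0.5)
prune.remove(pruned, "weight")  # make the pruning permanent

# 2) Low-rank factorization: replace W with rank-r factors from a
#    truncated SVD, so that W ~= (U_r * S_r) @ V_r^T.
r = 64
U, S, Vt = torch.linalg.svd(base.weight.data, full_matrices=False)
down = nn.Linear(512, r, bias=False)  # input -> rank-r space
up = nn.Linear(r, 512, bias=True)     # rank-r space -> output
down.weight.data = Vt[:r, :]
up.weight.data = U[:, :r] * S[:r]
up.bias.data = base.bias.data.clone()
low_rank = nn.Sequential(down, up)

# 3) "Steganographic capacity reduction", interpreted here (an assumption
#    on our part) as clearing low-order mantissa bits of float32 weights,
#    i.e., the bits that could otherwise hide embedded information.
def truncate_low_bits(w: torch.Tensor, bits: int = 16) -> torch.Tensor:
    ints = w.contiguous().view(torch.int32)  # reinterpret raw float bits
    mask = -(1 << bits)                      # e.g., 0xFFFF0000 for bits=16
    return (ints & mask).view(torch.float32)

truncated = copy.deepcopy(base)
truncated.weight.data = truncate_low_bits(truncated.weight.data)
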
In each of the resulting 27 cases we measure the electricity usage
and inference time over a wide range of compression values.
We find that pruning and low-rank factorization offer no
significant improvement, while steganographic capacity reduction
provides major benefits with respect to both training and inference.
We discuss the significance of these findings.
Natesh Reddy | Transforming Chatbot Text: A Sequence-to-Sequence Approach to Human-Like Text Generation

With recent advances in Large Language Models (LLMs) such as ChatGPT,
the boundary between text written by humans and that created by AI
has become blurred. This poses a threat to systems designed to detect
AI-generated content. In this work, we adopt a novel strategy: we
adversarially transform GPT-generated text using sequence-to-sequence (Seq2Seq)
models to make the text more human-like.
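
A minimal sketch of this transformation step appears below, using a Hugging Face Seq2Seq checkpoint as a stand-in; the checkpoint name, prompt prefix, and generation settings are placeholders, since the models in this work are trained on paired GPT/human data rather than taken off the shelf.

# Minimal sketch of Seq2Seq rewriting of GPT-generated text (illustrative).
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

checkpoint = "t5-small"  # stand-in for a model fine-tuned on GPT->human pairs
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

gpt_sentence = "The results of the experiment were found to be satisfactory."
inputs = tokenizer("paraphrase: " + gpt_sentence, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    num_beams=4,   # beam search for more fluent rewrites
    do_sample=False,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

In practice, the rewriting model would be fine-tuned so that its outputs carry the human-like cues described next.
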
We focus on improving GPT-generated sentences by including significant
linguistic, structural, and semantic components that are typical of
human-authored text. Experiments show that models trained to
detect AI-generated text fail to reliably distinguish our Seq2Seq-modified
text from human-generated text. However, once retrained on this
augmented data, models achieve improved accuracy in distinguishing
AI-generated from human-generated content, as sketched below. This work adds to the
accumulating knowledge of text transformation as a tool for both
attack—in the sense of defeating classification models—and
defense—in the sense of improved classifiers—thereby advancing
our understanding of AI-generated text.
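
To make the defense step referenced above concrete, the sketch below retrains a detector on data augmented with Seq2Seq-modified text. The TF-IDF features, logistic-regression classifier, and toy sentences are our choices for illustration, not the detection models or data used in this work.

# Illustrative sketch of the defense step: retraining a detector on data
# augmented with Seq2Seq-modified text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy placeholder sentences standing in for the real corpora.
human_texts = ["I grabbed coffee before the meeting ran long again."]
gpt_texts = ["The meeting was attended by participants prior to coffee."]
seq2seq_texts = ["I went to the meeting after grabbing some coffee."]

# Defense: include the adversarially transformed text in the AI class so
# the retrained detector learns to catch it as well.
X = human_texts + gpt_texts + seq2seq_texts
y = [0] * len(human_texts) + [1] * (len(gpt_texts) + len(seq2seq_texts))

detector = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
detector.fit(X, y)
print(detector.predict(["An example sentence to classify."]))
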