A curated, but probably biased and incomplete, list of LLM-generated text detection resources.
If you want to contribute to this list, feel free to open a pull request. You can also contact Ruixiang Tang from the Data Lab at Rice University by email: [email protected].
The emergence of large language models (LLMs) has resulted in LLM-generated text that is highly sophisticated and almost indistinguishable from text written by humans. However, this has also sparked concerns about potential misuse, such as spreading misinformation and disrupting the education system.
We group existing methods into two categories: black-box detection and white-box detection. Black-box detection methods are limited to API-level access to LLMs; they rely on collecting text samples from human and machine sources to train a classification model that discriminates between LLM- and human-generated texts.
The alternative is white-box detection, in which the detector has full access to the LLM and can control the model's generation behavior for traceability purposes. In practice, black-box detectors are commonly built by external entities, whereas white-box detection is generally carried out by LLM developers.
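As a minimal illustration of the black-box recipe (collect labelled human and machine samples, then train a discriminator), the sketch below fits a tiny multinomial Naive Bayes classifier on a toy corpus. The sample sentences, labels, and smoothing constant are all illustrative assumptions, not part of any published detector.

```python
from collections import Counter
import math

# Toy labelled corpus -- purely illustrative sentences, not real data.
SAMPLES = [
    ("the cat sat on the mat", "human"),
    ("i love rainy mornings honestly", "human"),
    ("as an ai language model i cannot", "machine"),
    ("certainly here is a detailed summary", "machine"),
]

def train_nb(samples):
    """Count word occurrences per label (multinomial Naive Bayes training)."""
    counts, totals = {}, Counter()
    for text, label in samples:
        bucket = counts.setdefault(label, Counter())
        for word in text.lower().split():
            bucket[word] += 1
            totals[label] += 1
    return counts, totals

def classify(text, counts, totals, alpha=1.0):
    """Return the label with the highest Laplace-smoothed log-likelihood."""
    vocab = set().union(*counts.values())
    grand_total = sum(totals.values())
    best_label, best_lp = None, float("-inf")
    for label in counts:
        lp = math.log(totals[label] / grand_total)  # class prior
        for word in text.lower().split():
            lp += math.log((counts[label][word] + alpha)
                           / (totals[label] + alpha * len(vocab)))
        if lp > best_lp:
            best_label, best_lp = label, lp
    return best_label
```

With enough labelled data, the same collect-then-classify pipeline scales up to neural discriminators such as fine-tuned RoBERTa detectors; Naive Bayes is used here only to keep the sketch dependency-free.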
- Black-box Detection
  - Data Collection
  - Detection Feature Selection
  - Classification Model
- White-box Detection
  - Post-hoc Watermarks
  - Inference Time Watermarks
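To make the inference-time watermark idea concrete, the toy sketch below biases each generation step toward a pseudo-random "green list" of tokens seeded by the previous token, so a detector that knows only the seeding scheme can count green tokens in suspect text. The `VOCAB`, `GAMMA`, and hash-based seeding here are illustrative assumptions, not any specific paper's implementation.

```python
import hashlib
import random

# Illustrative toy vocabulary and green-list fraction (assumptions only).
VOCAB = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta", "eta", "theta"]
GAMMA = 0.5  # fraction of the vocabulary marked "green" at each step

def green_list(prev_token):
    """Derive a pseudo-random 'green' subset of VOCAB from the previous token."""
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16)
    rng = random.Random(seed)
    shuffled = VOCAB[:]
    rng.shuffle(shuffled)
    return set(shuffled[: int(GAMMA * len(VOCAB))])

def generate_watermarked(start, length):
    """Stand-in for an LLM: always emit a green token. (A real model would
    merely up-weight green-token logits before sampling.)"""
    tokens = [start]
    for _ in range(length):
        tokens.append(sorted(green_list(tokens[-1]))[0])
    return tokens

def green_fraction(tokens):
    """Detector side: fraction of tokens that land in their green list.
    Watermarked text scores near 1.0; ordinary text scores near GAMMA."""
    hits = sum(1 for prev, cur in zip(tokens, tokens[1:])
               if cur in green_list(prev))
    return hits / max(1, len(tokens) - 1)
```

Because the detector needs only the seeding scheme (not the model weights), this style of watermark supports traceability even after the text leaves the provider; in practice a statistical test on the green-token count replaces the raw fraction used here.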
- The Science of Detecting LLM-Generated Texts
- Machine Generated Text: A Comprehensive Survey of Threat Models and Detection Methods
- Automatic Detection of Machine Generated Text: A Critical Survey
- Deepfake Text Detection: Limitations and Opportunities
- A Brief Survey on Deep Learning Based Data Hiding
- How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection
- GLTR: Statistical Detection and Visualization of Generated Text
- Automatic Detection of Generated Text is Easiest when Humans are Fooled
- Neural Deepfake Detection with Factual Structure of Text
- Defending Against Neural Fake News
- All That’s ‘Human’ Is Not Gold: Evaluating Human Evaluation of Generated Text
- The Limitations of Stylometry for Detecting Machine-Generated Fake News
- MAUVE: Measuring the Gap Between Neural Text and Human Text using Divergence Frontiers
- RoFT: A Tool for Evaluating Human Detection of Machine-Generated Text
- TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation
- Cross-Domain Detection of GPT-2-Generated Technical Text
- A Benchmark Corpus for the Detection of Automatically Generated Text in Academic Publications
- Unsupervised and Distributional Detection of Machine-Generated Text
- Threat Scenarios and Best Practices for Neural Fake News Detection
- Real or Fake Text?: Investigating Human Ability to Detect Boundaries Between Human-Written and Machine-Generated Text
- Unraveling the Mystery of Artifacts in Machine Generated Text
- Detecting Bot-Generated Text by Characterizing Linguistic Accommodation in Human-Bot Interactions
- Automatic Detection of Machine Generated Texts: Need More Tokens
- CoCo: Coherence-Enhanced Machine-Generated Text Detection Under Data Limitation With Contrastive Learning
- Feature-based detection of automated language models: tackling GPT-2, GPT-3 and Grover
- Release strategies and the social impacts of language models
- Adversarial Robustness of Neural-Statistical Features in Detection of Generative Transformer
- Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective
- Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding
- Through the Looking Glass: Learning to Attribute Synthetic Text Generated by Language Models
- Tracing Text Provenance via Context-Aware Lexical Substitution
- On Information Hiding in Natural Language Systems
- DeepHider: A Covert NLP Watermarking Framework Based on Multi-task Learning