07-21-2025, 07:47 PM
Welcome to the frontier where offensive security meets artificial intelligence.
This thread is a living index of the core concepts, tools, research papers, and attack vectors in Adversarial Machine Learning (AML) — the art of abusing, bypassing, or hardening AI systems.
What is Adversarial Machine Learning?
AML focuses on exploiting weaknesses in machine learning models to:
Offensive Techniques
Tools & Frameworks
You are not allowed to view links. Register or Login to view. : NLP adversarial testing
You are not allowed to view links. Register or Login to view. : Evasion & defense methods
You are not allowed to view links. Register or Login to view. : Comprehensive AML testing
You are not allowed to view links. Register or Login to view. : White-box and black-box attacks
You are not allowed to view links. Register or Login to view. : CV attacks on PyTorch/TensorFlow
Must-Read Papers
LLM-Specific Attacks (GPT, Claude, etc.)
Let’s build a solid knowledge base for adversarial AI security.
If you're reading a cool paper, building a model-breaking tool, or fuzzing GPT — post it here.
? “Attackers think in graphs. ML models think in probabilities. We think in both.”
This thread is a living index of the core concepts, tools, research papers, and attack vectors in Adversarial Machine Learning (AML) — the art of abusing, bypassing, or hardening AI systems.
What is Adversarial Machine Learning?
AML focuses on exploiting weaknesses in machine learning models to:
- Fool classifiers (e.g., malware labeled as benign)
- Poison training data
- Steal models or data
- Craft inputs that trigger unexpected behavior
Quote:If traditional apps have logic bugs, AI models have decision boundary bugs.
Offensive Techniques
- Evasion Attacks – Modify input to cause misclassification (e.g., making malware look benign).
- Model Poisoning – Inject malicious data during training to corrupt future predictions.
- Model Extraction – Reverse engineer black-box models using API access.
- Membership Inference – Identify whether a data point was in the training set.
- Prompt Injection (LLMs) – Manipulate instructions and outputs in AI chatbots.
Tools & Frameworks
You are not allowed to view links. Register or Login to view. : NLP adversarial testing
You are not allowed to view links. Register or Login to view. : Evasion & defense methods
You are not allowed to view links. Register or Login to view. : Comprehensive AML testing
You are not allowed to view links. Register or Login to view. : White-box and black-box attacks
You are not allowed to view links. Register or Login to view. : CV attacks on PyTorch/TensorFlow
Must-Read Papers
- Explaining and Harnessing Adversarial Examples – You are not allowed to view links. Register or Login to view.
- Backdooring Neural Networks – You are not allowed to view links. Register or Login to view.
- Adversarial Examples Are Not Bugs, They Are Features – You are not allowed to view links. Register or Login to view.
- Universal Adversarial Perturbations – You are not allowed to view links. Register or Login to view.
LLM-Specific Attacks (GPT, Claude, etc.)
- Prompt Injection & Jailbreaks
- Training Data Leakage
- Fine-Tuning Exploits
- Prompt Leaking via Reverse Prompting
Let’s build a solid knowledge base for adversarial AI security.
If you're reading a cool paper, building a model-breaking tool, or fuzzing GPT — post it here.
? “Attackers think in graphs. ML models think in probabilities. We think in both.”