AI models are trained to answer questions, but now they’re being bombarded with hundreds of thousands of them by actors trying to reverse-engineer and steal their intelligence.
Google says it has detected and disrupted a surge in attempts to steal the capabilities of its Gemini AI models through systematic prompting, a technique known as “model extraction” or a “distillation attack,” according to a new threat intelligence report released this week.
The Google Threat Intelligence Group (GTIG) revealed that throughout 2025, the company identified frequent model extraction attacks from private sector entities worldwide and researchers seeking to clone proprietary logic. These attacks represent a form of intellectual property theft where adversaries use legitimate API access to systematically probe AI models and replicate their capabilities without permission.
How the Attacks Work
Model extraction attacks occur when an adversary uses legitimate access to systematically probe a mature machine learning model to extract information used to train a new model. The technique, called knowledge distillation, allows attackers to transfer knowledge from one model to another, effectively creating a clone at significantly lower cost than training from scratch.
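To make the mechanics concrete, here is a minimal, hypothetical sketch of the data-collection phase of such an attack. The `query_teacher` function is a stand-in for calls to a commercial model’s API (no real service or endpoint is implied); the output file is the kind of prompt/response dataset an attacker would later use to fine-tune a cheaper “student” model.

```python
# Minimal sketch of the data-collection phase of a distillation attack.
# "query_teacher" is a hypothetical placeholder for a commercial model API;
# in a real campaign this would be many thousands of legitimate-looking calls.
import json
import random

def query_teacher(prompt: str) -> str:
    # Placeholder: a real attacker would call the target model's API here
    # and capture its full response, including any exposed reasoning text.
    return f"[teacher answer to: {prompt}]"

def build_distillation_set(prompts, out_path="distill_train.jsonl"):
    """Probe the teacher with many prompts and save prompt/response pairs.

    The resulting file is the kind of supervised fine-tuning data used to
    train a "student" model that imitates the teacher at far lower cost.
    """
    with open(out_path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            response = query_teacher(prompt)
            f.write(json.dumps({"prompt": prompt, "response": response}) + "\n")

# Attackers typically vary topic, language, and task type to cover the
# teacher's behavior broadly; the campaign Google describes used over
# 100,000 such prompts.
topics = ["math word problem", "legal summary", "code review", "translation"]
prompts = [f"Solve this {random.choice(topics)} step by step, #{i}" for i in range(100)]
build_distillation_set(prompts)
```

The second phase, fine-tuning a student model on the collected pairs, is what turns the harvested outputs into a functional clone without paying the original training cost.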
In one notable case study, Google uncovered a coordinated campaign targeting Gemini’s reasoning capabilities. The attack involved more than 100,000 prompts designed to coerce the model into outputting its full reasoning process, with questions spanning multiple languages and task types. Google’s systems detected the activity in real time and implemented protections.
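Google has not published how its detection works, but a toy illustration of the kind of signal a provider might monitor is unusually high query volume combined with a high share of requests asking for full reasoning output. The thresholds and marker phrases below are assumptions chosen purely for illustration.

```python
# Illustrative only: Google has not disclosed its detection logic. This toy
# heuristic flags API keys whose traffic looks like systematic extraction:
# very high volume plus a large fraction of reasoning-elicitation prompts.
REASONING_MARKERS = ("step by step", "show your reasoning", "chain of thought")

def extraction_risk(requests_by_key, volume_threshold=50_000, reasoning_ratio=0.5):
    """Return the API keys whose traffic resembles model extraction.

    requests_by_key: dict mapping API key -> list of prompt strings.
    Thresholds are hypothetical values chosen for illustration.
    """
    flagged = set()
    for key, prompts in requests_by_key.items():
        if len(prompts) < volume_threshold:
            continue
        reasoning_hits = sum(
            any(m in p.lower() for m in REASONING_MARKERS) for p in prompts
        )
        if reasoning_hits / len(prompts) >= reasoning_ratio:
            flagged.add(key)
    return flagged

# Example: a key issuing 100,000 reasoning-heavy prompts would be flagged.
traffic = {"key-abc": ["Explain step by step: " + str(i) for i in range(100_000)]}
print(extraction_risk(traffic))  # {'key-abc'}
```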
A Growing Industry Concern
Google’s disclosure comes amid rising concerns across the AI industry about model theft. OpenAI recently raised similar alarms, stating that DeepSeek and other Chinese companies have been attempting to distill its models. The practice has emerged as a new vector for competitive intelligence and IP theft as AI models become increasingly valuable commercial assets.
Model extraction enables an attacker to accelerate AI model development at significantly lower cost, effectively representing a form of intellectual property theft. The risk is particularly acute for organizations offering specialized or custom-tuned models as commercial services.
Google’s Response
While Google did not observe any direct attacks on frontier models from advanced persistent threat actors, the company detected and mitigated model extraction activity through real-time defenses designed to degrade the performance of cloned models.
Model extraction attacks violate Google’s Terms of Service and may be subject to takedowns and legal action. The company emphasized it continuously monitors for extraction patterns and has implemented proactive safeguards to protect its proprietary technology.
The revelation underscores a growing challenge for AI companies: as models become more powerful and commercially valuable, they simultaneously become more attractive targets for competitors and bad actors seeking to replicate years of research and billions in investment through systematic abuse of legitimate access.