Anthropic says Alibaba illicitly extracted Claude AI model capabilities

When AI Learns from AI: Is Alibaba's Claude 'Distillation' Theft or a New Form of Scraping?

The ongoing evolution of AI continues to challenge traditional notions of intellectual property, with the latest controversy centering on Anthropic's accusation that Alibaba has been "distilling" Claude's capabilities. The incident highlights a core tension within the industry: companies like Anthropic fight to protect their models, yet critics ask whether these same companies are simply objecting to having their own data-acquisition methods turned against them. Given the extensive scraping of public data used to train foundational models, the definition of "fair use" in generative AI is increasingly ambiguous, as the ongoing New York Times v. OpenAI lawsuit illustrates.

The Incident: Anthropic's Claims Against Alibaba

Anthropic, the US AI company behind the Claude model, has accused Chinese tech giant Alibaba of what it describes as the largest known attack of its kind on its systems. Between April 22 and June 5, 2026, operators allegedly affiliated with Alibaba and its AI lab, Alibaba Qwen, generated over 28.8 million exchanges with Claude using almost 25,000 fraudulent accounts.

Anthropic asserts the goal was to illicitly extract Claude's advanced capabilities through "distillation." According to the company, this was not casual use but a targeted campaign designed to accelerate China's ability to replicate Anthropic's advanced Mythos Preview capabilities. The accusation follows similar ones in February 2026, when Anthropic cited DeepSeek and two other Chinese startups for comparable actions. Alibaba did not immediately respond to a request for comment.

[Image: Data flowing from a large model to a smaller one.]

How "Distillation" Works and Its Implications

Unlike traditional cyberattacks that aim to breach networks or steal code, AI "distillation" extracts capabilities through ordinary inference: it needs only the ability to send prompts and read responses. A less capable "student" model, such as Alibaba's Qwen, learns directly from a highly capable "teacher" model such as Claude. The student is trained on the teacher's responses to a vast array of prompts, absorbing much of its "knowledge" without bearing the original foundational training cost.
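The idea can be shown with a deliberately tiny toy, which is a sketch under stated assumptions and not a description of how any real system was trained. Here the larger model (the "teacher") is a hidden one-feature logistic function, and the smaller model (the "student") only ever sees the teacher's outputs for a set of queries. The names `teacher`, `student`, and `distill`, and all constants, are hypothetical; real language-model distillation fine-tunes on prompt/response text instead of fitting two numbers.

```python
import math
import random


def teacher(x: float) -> float:
    """Stand-in for a black-box model: returns P(label = 1 | x).

    The constants 3.0 and -1.0 are hidden from the student (illustrative values).
    """
    return 1.0 / (1.0 + math.exp(-(3.0 * x - 1.0)))


def student(x: float, w: float, b: float) -> float:
    """The student's prediction for input x under parameters (w, b)."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def distill(queries, responses, steps=2000, lr=0.5):
    """Fit the student to the teacher's soft outputs.

    Plain gradient descent on cross-entropy between teacher and student
    probabilities; only (query, response) pairs are used.
    """
    w, b = 0.0, 0.0
    n = len(queries)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, p in zip(queries, responses):
            q = student(x, w, b)
            grad_w += (q - p) * x
            grad_b += q - p
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Step 1: send queries to the black box and record its answers.
rng = random.Random(0)
queries = [rng.uniform(-2.0, 2.0) for _ in range(200)]
responses = [teacher(x) for x in queries]

# Step 2: train the student on the recorded answers alone.
w, b = distill(queries, responses)
print(f"student parameters: w={w:.3f}, b={b:.3f}")
```

In this toy the student has the same functional form as the teacher, so it can recover the teacher's behavior almost exactly from query access alone. That is the best case for the copier. In practice a student typically captures only part of what the teacher can do.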

As Anthropic describes it, the attack chain unfolded in four steps. First, Alibaba-affiliated operators established almost 25,000 fraudulent accounts to access Claude's API, bypassing usage limits and obscuring their activity, a common tactic in API abuse. Second, they fed over 28.8 million prompts to Claude, covering diverse topics and tasks to probe the model's reasoning, coding abilities, and agentic capabilities, in effect sampling its behavior across those domains. Third, Claude's detailed response to each prompt was collected; these responses constitute the exfiltrated "intelligence." Fourth, the collected responses became training data for Alibaba's smaller model, allowing it to mimic Claude's patterns, reasoning, and style and to acquire its capabilities without the original, expensive R&D.

Traditional cyberattacks are often mapped to frameworks like MITRE ATT&CK (e.g., T1078 for Valid Accounts in the initial access phase). Inference-based exfiltration of model capabilities through distillation is a newer challenge that threat intelligence frameworks are still evolving to categorize, which highlights the unusual nature of this intellectual property threat.
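The reported scale figures imply a per-account load that is easy to work out. The short calculation below uses only the numbers quoted in this article. It assumes, purely for illustration, that the exchanges were spread evenly across accounts and across the whole date range, which the article does not state.

```python
from datetime import date

EXCHANGES = 28_800_000  # "over 28.8 million" exchanges, used as a lower bound
ACCOUNTS = 25_000       # "almost 25,000" accounts, used as an upper bound
START, END = date(2026, 4, 22), date(2026, 6, 5)

days = (END - START).days + 1              # count both endpoints
per_account = EXCHANGES / ACCOUNTS         # exchanges per account, whole campaign
per_account_per_day = per_account / days   # exchanges per account per day
per_day_total = EXCHANGES / days           # exchanges per day, all accounts

print(days, per_account, round(per_account_per_day, 1), per_day_total)
```

Under that even-spread assumption, each account averages roughly 1,152 exchanges over 45 days, or about 26 per day. A single account at that rate would look like ordinary usage, which suggests why splitting traffic across many accounts can slip past per-account limits. The aggregate of about 640,000 exchanges per day is where the activity stands out.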

The practical impact is a competitor gaining a significant shortcut in AI development. A less capable model can rapidly improve its performance by effectively "learning" from a more advanced one. This isn't a case of stolen code, but rather the extraction of intelligence embedded within the model's weights and architecture – a core intellectual property asset.

The Realistic Impact: IP, Ethics, and Tech Rivalry

For Anthropic, the immediate consequence is clear: a competitor potentially gains a substantial advantage, eroding Anthropic's competitive edge and the value of its R&D. For the broader AI industry, this raises fundamental questions about how to protect proprietary models when their utility inherently involves interaction.

However, the situation is complex. Discussions across developer forums often highlight skepticism regarding Anthropic's stance. Many point out that US AI companies, including Anthropic, OpenAI, and Google, built their foundational models by scraping vast amounts of internet data. This often included copyrighted material, personal blogs, and creative works, frequently without explicit consent or compensation. The argument is direct: if you can scrape the internet to train your model, why can't someone else use your model's outputs to train theirs? This perceived hypocrisy significantly muddies the legal and ethical waters.

This incident also aligns with the intensifying US-China tech rivalry. The US government has long accused China of industrial-scale IP theft. Anthropic's Fable and Mythos AI models had already been restricted for non-US citizens by government order, which led to access being suspended for everyone, underscoring US concern over the transfer of advanced AI capabilities. The alleged distillation effort further fuels that concern and frames it as a national security issue.

Mitigations and Future Directions

From a technical perspective, several detection and prevention methods are being explored or implemented across the industry. One is behavioral analytics, akin to the platforms used in fraud detection, to identify unusual API usage patterns. The key challenge is distinguishing legitimate high-volume usage from coordinated automated activity, which means flagging accounts that exhibit near-identical query patterns or sudden, massive spikes from new IPs. Another is API gateway policy: enforcing adaptive rate limits based on usage heuristics and requiring multi-factor authentication for API key generation would make creating and scaling fraudulent accounts significantly harder. Finally, research is exploring techniques to embed subtle, imperceptible "watermarks" into model outputs. Recent work shows promise in creating verifiable traces that could prove distillation, though the field is still nascent and faces challenges in keeping watermarks robust against adversarial attacks and in maintaining output quality at scale.
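As a rough illustration of the "identical query patterns" signal, the sketch below reduces each prompt to a crude template and flags pairs of accounts whose template sets overlap heavily. It is a minimal sketch, not any vendor's actual detection logic. The function names, the masking rules, the sample logs, and the 0.8 threshold are all hypothetical.

```python
import re
from itertools import combinations


def template(prompt: str) -> str:
    """Reduce a prompt to a rough template by masking variable parts."""
    p = prompt.lower()
    p = re.sub(r'"[^"]*"', "<str>", p)  # mask double-quoted strings
    p = re.sub(r"\d+", "<num>", p)      # mask runs of digits
    return re.sub(r"\s+", " ", p).strip()


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    return len(a & b) / len(a | b) if (a or b) else 0.0


def flag_coordinated(logs: dict[str, list[str]], threshold: float = 0.8) -> set[str]:
    """Return accounts whose template sets closely match another account's."""
    shapes = {acct: {template(p) for p in prompts} for acct, prompts in logs.items()}
    flagged: set[str] = set()
    for a, b in combinations(shapes, 2):
        if jaccard(shapes[a], shapes[b]) >= threshold:
            flagged.update((a, b))
    return flagged


# Hypothetical request logs keyed by account ID.
logs = {
    "acct_1": ["Solve step by step: 12 + 7", 'Write a function named "foo" in Go'],
    "acct_2": ["Solve step by step: 431 + 9", 'Write a function named "bar" in Go'],
    "acct_3": ["What is a good name for a cat?", "Summarize this email for me"],
}
flagged = flag_coordinated(logs)
print(sorted(flagged))
```

The pairwise comparison is quadratic in the number of accounts, so it would not scale to tens of thousands of accounts as written. A production system would more plausibly use an approximate method such as locality-sensitive hashing, and would combine this signal with others, such as volume spikes and IP reputation, to limit false positives on legitimate heavy users.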

Ultimately, addressing this challenge requires more than just technical defenses; it demands a fundamental re-evaluation of intellectual property in the AI era, as current legal frameworks simply weren't designed for models that learn from other models' outputs. We require clearer international agreements and potentially new legal precedents that define what constitutes "theft" or "fair use" when the "product" is an AI's learned capabilities, not just its code or data. Without this clarity, the industry risks a perpetual cycle of accusations, making it increasingly difficult to distinguish genuine innovation from illicit appropriation.