Is distilling from GPT-4 legal? The OpenAI–DeepSeek question

If distillation is this powerful, an obvious question follows: can you legally distill from a model you don't own, such as a frontier model behind a paid API? The honest answer as of mid-2026 is that it's contested and the law is unsettled. Here's the actual state of play, with the speculation clearly labeled.

It's a contract fight, not (yet) a copyright one

The first thing to understand: the restrictions on distillation are contractual, not obviously grounded in copyright. OpenAI, Anthropic, Mistral, and xAI all include clauses in their terms of service that bar using their outputs to train competing models — sometimes called anti-distillation or anti-competitive-use clauses.

That's a meaningfully different legal posture from "this is copyright infringement." Whether a model's raw outputs are protectable by copyright at all is legally untested. So the live theories are breach of contract and possibly computer-fraud statutes, not a clean copyright claim.

The OpenAI–DeepSeek timeline

This stopped being abstract in 2025:

January 2025 — OpenAI and Microsoft said publicly they had evidence DeepSeek may have "inappropriately" used OpenAI's outputs.
February 12, 2026 — OpenAI escalated, submitting a memo to the U.S. House Select Committee on China accusing DeepSeek of "free-riding" via distillation.

The alleged specifics (none proven): DeepSeek outputs showing a high degree of stylistic similarity to ChatGPT's, and abnormal API traffic routed through obfuscated proxies and large numbers of throwaway accounts.

What's actually established vs. alleged

It's worth being precise, because this story is often reported as settled when it isn't:

Established: The major labs' terms restrict output-based training. OpenAI publicly raised the accusation and escalated it to Congress.
Alleged, not proven: That DeepSeek specifically distilled OpenAI models. DeepSeek has not confirmed it.
Notably absent: As of mid-2026, no lawsuit has been filed. Any eventual claim would most plausibly sound in breach of contract or computer-fraud law — not copyright — precisely because the protectability of model outputs is unresolved.

The irony nobody can quite avoid

Analysts have noted the awkward backdrop: much of the industry was built on "recursive learning" from data of contested provenance. A field that trained on the open web is now objecting to others training on its outputs. That tension is one the courts haven't resolved and the labs would mostly rather not litigate in public.

What it means for you

If you're learning to distill — which is what this site is for — the practical takeaways:

Distilling from open-weight models (Llama, Qwen, Gemma, DeepSeek's open releases, and the many models with permissive licenses) largely sidesteps the question. This is where almost all of the open distillation work, including the DeepSeek-R1 distilled models, actually happens. Check each model's license, because some open-weight licenses attach their own conditions to training on outputs, but the open ecosystem is vast.
Distilling within a provider's own platform is explicitly sanctioned — OpenAI, Azure, Vertex, and Bedrock all offer distillation as a service.
Distilling one vendor's closed model to build a competitor is the contractually risky path, and the one the headlines are about.
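
One way to make these three paths concrete is a small pre-flight check that sorts candidate teacher models before any training run. The sketch below is illustrative only: the model names, the access labels, and the REVIEWED_LICENSES allow-list are all hypothetical, and the allow-list stands in for whatever licenses you have actually read and cleared yourself. It encodes the takeaways above and does not decide any legal question.

```python
from dataclasses import dataclass

# Hypothetical allow-list: license identifiers that you (or your counsel) have
# already reviewed and judged to permit training a student on model outputs.
REVIEWED_LICENSES = {"apache-2.0", "mit"}


@dataclass(frozen=True)
class Teacher:
    name: str        # illustrative identifier, not a real model
    access: str      # "open-weights", "provider-distillation", or "closed-api"
    license_id: str  # license or terms identifier as you recorded it


def classify(teacher: Teacher) -> str:
    """Map a candidate teacher onto the three paths described above."""
    if teacher.access == "provider-distillation":
        return "sanctioned"      # distilling inside the provider's own platform
    if teacher.access == "open-weights":
        if teacher.license_id.lower() in REVIEWED_LICENSES:
            return "ok"
        return "review-license"  # open weights, but license not yet reviewed
    if teacher.access == "closed-api":
        return "contract-risk"   # output-based training may breach the terms
    raise ValueError(f"unknown access type: {teacher.access!r}")


# Hypothetical candidates, one per outcome.
candidates = [
    Teacher("example-org/open-teacher", "open-weights", "Apache-2.0"),
    Teacher("example-org/custom-license-teacher", "open-weights", "custom"),
    Teacher("example-vendor/hosted-teacher", "provider-distillation", "vendor-terms"),
    Teacher("example-vendor/frontier-api", "closed-api", "vendor-terms"),
]

for t in candidates:
    print(f"{t.name}: {classify(t)}")
```

The "review-license" outcome is the point of the exercise: open weights alone do not settle whether outputs may be used for training, so anything not yet on your reviewed list gets read before it gets used.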

None of this is legal advice, and the situation is moving. But the shape of it is clear: the fight is about contracts and competition, the law is untested, and the open-weights ecosystem is more than rich enough to learn and build on without going near the gray zone.