NEW DELHI — As
Chinese artificial intelligence (AI) company DeepSeek continues to rattle the
technology world amid the US-China trade war, OpenAI reportedly suspects that
DeepSeek used ChatGPT data to train its low-cost AI models.
Sam Altman-run OpenAI told the Financial Times that it found
evidence linking DeepSeek to the use of “distillation,” a common
technique in which developers train a smaller model on the outputs of a
larger, more capable large language model (LLM).
OpenAI and Microsoft are now probing whether the Chinese
rival used their APIs to train DeepSeek’s own models.
OpenAI reportedly spent $100 million to train its GPT-4
model.
According to David Sacks, US President Donald Trump’s
artificial intelligence czar, “it is possible” that IP theft had occurred in
the case of DeepSeek.
“There’s substantial evidence that what DeepSeek did here is
they distilled knowledge out of OpenAI models and I don’t think OpenAI is very
happy about this,” Sacks told Fox News.
In a statement, OpenAI said, “We know PRC (China) based
companies — and others — are constantly trying to distill the models of leading
US AI companies.”
Meanwhile, Euroconsumers, a coalition of consumer groups in
Europe, has filed a complaint with the Italian Data Protection Authority over
how DeepSeek handles personal data under the GDPR.
The Italian DPA said that “the data of millions of Italians
is at risk” and has given DeepSeek 20 days to respond.
DeepSeek is backed by High-Flyer Capital Management, a
Chinese quantitative hedge fund. AI enthusiast Liang Wenfeng co-founded
High-Flyer in 2015.
Separately, DeepSeek’s Android app has taken the top spot on
the Google Play Store. The app is essentially a ChatGPT alternative that’s powered
by the Chinese lab’s V3 model.