Date: 2025-01-31

A new report from SemiAnalysis estimated that DeepSeek's hardware spend is "well higher than $500M."

China's DeepSeek app took off in the U.S. this week.

DeepSeek said the model's total training costs amounted to almost $5.6 million, a fraction of the amount rivals have spent.

China's DeepSeek became the biggest topic in tech this week, with many in the industry and on Wall Street focused on a single number: $6 million.

In DeepSeek's paper about its newest artificial intelligence model, the company said that its total training costs amounted to $5.576 million, based on the rental price of Nvidia's graphics processing units. DeepSeek included a clear caveat, saying that the number included only the model's "official training" and excluded the costs tied to "prior research and ablation experiments on architectures, algorithms, or data."

Early in the week, DeepSeek's AI Assistant took the coveted spot for most-downloaded free app in the U.S. on Apple's App Store, dethroning OpenAI's ChatGPT. Global tech stocks sold off, with chipmakers Nvidia and Broadcom losing a combined $800 billion in market cap on Monday.

A new report from SemiAnalysis, a semiconductor research and consulting firm, added more context to DeepSeek's expenses. The firm estimated that DeepSeek's hardware spend is "well higher than $500M over the company history," adding that R&D costs and total cost of ownership are significant. Generating "synthetic data" for the model to train on would require a "considerable amount of compute," SemiAnalysis wrote.

The report said Anthropic's Claude 3.5 Sonnet cost "$10s of millions to train," but noted that Anthropic raised billions of dollars from Amazon and Google, an indication of how much more money is required to run the models and the company.

"It's because they have to experiment, come up with new architectures, gather and clean data, pay employees, and much more," SemiAnalysis said.

DeepSeek's own paper does not include an estimation of its compute costs. The company didn't immediately respond to a request for comment.

"To be clear DeepSeek is unique in that they achieved this level of cost and capabilities first," SemiAnalysis wrote. The firm added that DeepSeek's R1 "is a very good model" and that "catching up to the reasoning edge this quickly is objectively impressive."

Experts and analysts this week touted the quality of DeepSeek's model, and noted how impressive it is considering the U.S. curbed chip exports to China three times in three years. That led to concerns that the U.S. is falling behind its chief adversary in a market that's predicted to top $1 trillion in revenue within a decade.

Bernstein analysts wrote in a note Monday that "according to the many (occasionally hysterical) hot takes we saw [over the weekend,] the implications range anywhere from 'That's really interesting' to 'This is the death-knell of the AI infrastructure complex as we know it.'"

DeepSeek was founded in 2023 by Liang Wenfeng, co-founder of High-Flyer, a quantitative hedge fund focused on AI. The AI startup reportedly grew out of the hedge fund's AI research unit in April 2023 to focus on large language models and reaching artificial general intelligence, or AGI — a branch of AI that equals or surpasses human intellect on a wide range of tasks, and that OpenAI and others are pursuing.

DeepSeek is still wholly owned and funded by High-Flyer, according to analysts at Jefferies.

The buzz around DeepSeek began picking up steam earlier this month, when the startup released R1, its reasoning model that rivals OpenAI's o1. It's open-source, meaning that any AI developer can use it.

Like other Chinese chatbots, DeepSeek's chatbot has limitations on certain topics: When asked about some of Chinese leader Xi Jinping's policies, for instance, DeepSeek reportedly steers the user away from similar lines of questioning.

OpenAI CEO Sam Altman has praised the model publicly, but the company has also said it believes there's evidence that DeepSeek improperly harvested OpenAI data to build its product.

At an event in Washington, D.C., on Thursday hosted by OpenAI, Altman said DeepSeek is "clearly a great model."

"This is a reminder of the level of competition and the need for democratic AI to win," he said. He said it also points to the "level of interest in reasoning, the level of interest in open source."

