A year-old startup out of China is taking the AI industry by storm after releasing a chatbot which rivals the performance of ChatGPT while using a fraction of the power, cooling, and training expense of what OpenAI, Google, and Anthropicās systems demand. Hereās everything you need to know about Deepseekās V3 and R1 models and why the company could fundamentally upend Americaās AI ambitions.
What is DeepSeek?
DeepSeek (technically, āHangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.ā) is a Chinese AI startup that was originally founded as an AI lab for its parent company, High-Flyer, in April, 2023. That May, DeepSeek was spun off into its own company (with High-Flyer remaining on as an investor) and also released its DeepSeek-V2 model. V2 offered performance on par with other leading Chinese AI firms, such as ByteDance, Tencent, and Baidu, but at a much lower operating cost.
The company followed up with the release of V3 in December 2024. V3 is a 671 billion-parameter model that reportedly took less than 2 months to train. Whatās more, according to a recent analysis from Jeffries, DeepSeekās ātraining cost of only US$5.6m (assuming $2/H800 hour rental cost). That is less than 10% of the cost of Metaās Llama.ā Thatās a tiny fraction of the hundreds of millions to billions of dollars that US firms like Google, Microsoft, xAI, and OpenAI have spent training their models.
š Introducing DeepSeek-V3!
Biggest leap forward yet:
ā” 60 tokens/second (3x faster than V2!)
šŖ Enhanced capabilities
š API compatibility intact
š Fully open-source models & papersš 1/n pic.twitter.com/p1dV9gJ2Sd
ā DeepSeek (@deepseek_ai) December 26, 2024
Benchmark tests put V3ās performance on par with GPT-4o and Claude 3.5 Sonnet. A December 2024 Op-Ed in The Hill categorized DeepSeekās success as Americaās āSputnik Moment.ā
DeepSeek released its R1-Lite-Preview model in November 2024, claiming that the new model could outperform OpenAIās o1 family of reasoning models (and do so at a fraction of the price). The company estimates that the R1 model is between 20 and 50 times less expensive to run, depending on the task, than OpenAIās o1. DeepSeek subsequently released DeepSeek-R1 and DeepSeek-R1-Zero in January 2025. The R1 model, unlike its o1 rival, is open source, which means that any developer can use it.
As such V3 and R1 have exploded in popularity since their release, with DeepSeekās V3-powered AI Assistant displacing ChatGPT at the top of the app stores. Venture capitalist Marc Andreesen, in a recent social media post, called DeepSeekās chatbotĀ āone of the most amazing and impressive breakthroughs Iāve ever seenā and a āprofound gift to the world.ā
What can DeepSeek do?
As an open-source large language model, DeepSeekās chatbots can do essentially everything that ChatGPT, Gemini, and Claude can. That includes text, audio, image, and video generation. Whatās more, DeepSeekās newly released family of multimodal models, dubbed Janus Pro, reportedly outperforms DALL-E 3 as well as PixArt-alpha, Emu3-Gen, and Stable Diffusion XL, on a pair of industry benchmarks. DeepSeek-R1, rivaling o1, is specifically designed to perform complex reasoning tasks, while generating step-by-step solutions to problems and establishing ālogical chains of thought,ā where it explains its reasoning process step-by-step when solving a problem.Ā
oh boy #deepseek
ā Alexios Mantzarlis (@mantzarlis.com) 2025-01-27T16:50:40.640Z
What DeepSeekās products canāt do is talk about Tienanmen Square. Or the Yellow Umbrella protests. Or President Xi Jinpingās likeness to Winnie the Pooh. Basically, if itās a subject considered verboten by the Chinese Communist Party, DeepSeekās chatbots will not address it or engage in any meaningful way.
Who can use DeepSeek?
As an open-source LLM, DeepSeekās model can be used by any developer for free. OpenAI charges $200 per month for the Pro subscription needed to access o1. DeepSeekās models are available on the web, through the companyās API, and via mobile apps. You will need to sign up for a free account at the DeepSeek website in order to use it, however the company has temporarily paused new sign ups in response to ālarge-scale malicious attacks on DeepSeekās services.ā Existing users can sign in and use the platform as normal, but thereās no word yet on when new users will be able to try DeepSeek for themselves.
Why is DeepSeek suddenly such a big deal?
Since the release of ChatGPT in November 2023, American AI companies have been laser-focused on building bigger, more powerful, more expansive, more power and resource-intensive large language models. Rather than seek to build more cost-effective and energy-efficient LLMs, companies like OpenAI, Microsoft, Anthropic, and Google instead saw fit to simply brute force the technologyās advancement by, in the American tradition, simply throwing absurd amounts of money and resources at the problem. In 2024 alone, xAI CEO Elon Musk was expected to personally spend upwards of $10 billion on AI initiatives. OpenAI and its partners just announced a $500 billion Project Stargate initiative that would drastically accelerate the construction of green energy utilities and AI data centers across the US. Google plans to prioritize scaling the Gemini platform throughout 2025, according to CEO Sundar Pichai, and is expected to spend billions this year in pursuit of that goal. Meta announced in mid-January that it would spend as much as $65 billion this year on AI development.
DeepSeek just showed the world that none of that is actually necessary ā that the āAI Boomā which has been helping spur the American economy in recent months and which has made GPU companies like Nvidia exponentially more wealthy than they were in October 2023, may be nothing more than a sham. It also calls into question just how much of a lead the US actually has in AI, despite repeatedly banning shipments of leading-edge GPUs to China over the past year.
āThe bottom line is the US outperformance has been driven by tech and the lead that US companies have in AI,ā Keith Lerner, an analyst at Truist, told CNN. āThe DeepSeek model rollout is leading investors to question the lead that US companies have and how much is being spent and whether that spending will lead to profits (or overspending).ā
In short, DeepSeek just beat the American AI industry at its own game, showing that the current mantra of āgrowth at all costsā is no longer valid. āDeepSeek clearly doesnāt have access to as much compute as U.S. hyperscalers and somehow managed to develop a model that appears highly competitive,ā Srini Pajjuri, semiconductor analyst at Raymond James, told CNBC.Ā If a Chinese startup can build an AI model that works just as well as OpenAIās latest and greatest, and do so in under two months and for less than $6 million, then what use is Sam Altman anymore?
āTime will tell if the DeepSeek threat is real ā the race is on as to what technology works and how the big Western players will respond and evolve,ā Michael Block, market strategist at Third Seven Capital, told CNN. āMarkets had gotten too complacent on the beginning of the Trump 2.0 era and may have been looking for an excuse to pull back ā and they got a great one here.ā
Read the full article here