DeepSeek: everything you need to know about the AI that dethroned ChatGPT

News Room

A year-old startup out of China is taking the AI industry by storm after releasing a chatbot that rivals the performance of ChatGPT while using a fraction of the power, cooling, and training expense that OpenAI, Google, and Anthropic’s systems demand. Here’s everything you need to know about DeepSeek’s V3 and R1 models and why the company could fundamentally upend America’s AI ambitions.

What is DeepSeek?

DeepSeek (technically, “Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd.”) is a Chinese AI startup that was originally founded as an AI lab for its parent company, High-Flyer, in April 2023. DeepSeek was later spun off into its own company (with High-Flyer remaining on as an investor), and it released its DeepSeek-V2 model in May 2024. V2 offered performance on par with models from other leading Chinese AI firms, such as ByteDance, Tencent, and Baidu, but at a much lower operating cost.

The company followed up with the release of V3 in December 2024. V3 is a 671 billion-parameter model that reportedly took less than two months to train. What’s more, according to a recent analysis from Jefferies, DeepSeek had a “training cost of only US$5.6m (assuming $2/H800 hour rental cost). That is less than 10% of the cost of Meta’s Llama.” That’s a tiny fraction of the hundreds of millions to billions of dollars that US firms like Google, Microsoft, xAI, and OpenAI have spent training their models.
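To put the Jefferies figures in perspective, the quoted cost and rental rate imply a total GPU-hour budget. The sketch below only redoes that arithmetic. The function names are illustrative, and the 60-day training window is an assumption standing in for “less than two months,” not a reported figure.

```python
# Back-of-the-envelope check of the training-cost figures quoted above.
# The two constants come from the Jefferies quote; everything else is
# an illustrative assumption.

TRAINING_COST_USD = 5_600_000      # "US$5.6m"
RENTAL_USD_PER_GPU_HOUR = 2.0      # "$2/H800 hour rental cost"
ASSUMED_TRAINING_DAYS = 60         # assumption: stand-in for "less than two months"


def implied_gpu_hours(cost_usd: float, rate_usd_per_hour: float) -> float:
    """Total GPU-hours that a given budget buys at a given hourly rate."""
    return cost_usd / rate_usd_per_hour


def average_gpus_in_use(gpu_hours: float, days: float) -> float:
    """Average number of GPUs running around the clock for `days` days."""
    return gpu_hours / (days * 24)


gpu_hours = implied_gpu_hours(TRAINING_COST_USD, RENTAL_USD_PER_GPU_HOUR)
gpus = average_gpus_in_use(gpu_hours, ASSUMED_TRAINING_DAYS)

print(f"Implied GPU-hours: {gpu_hours:,.0f}")
print(f"Average GPUs over {ASSUMED_TRAINING_DAYS} days: {gpus:,.0f}")
```

Under those assumptions, the quoted budget works out to roughly 2.8 million H800 GPU-hours, or on the order of two thousand GPUs running continuously for the assumed window.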

🚀 Introducing DeepSeek-V3!

Biggest leap forward yet:
⚡ 60 tokens/second (3x faster than V2!)
💪 Enhanced capabilities
🛠 API compatibility intact
🌐 Fully open-source models & papers

🐋 1/n pic.twitter.com/p1dV9gJ2Sd

DeepSeek (@deepseek_ai), December 26, 2024

Benchmark tests put V3’s performance on par with GPT-4o and Claude 3.5 Sonnet. A December 2024 op-ed in The Hill characterized DeepSeek’s success as America’s “Sputnik Moment.”

DeepSeek released its R1-Lite-Preview model in November 2024, claiming that the new model could outperform OpenAI’s o1 family of reasoning models (and do so at a fraction of the price). The company estimates that the R1 model is between 20 and 50 times less expensive to run, depending on the task, than OpenAI’s o1. DeepSeek subsequently released DeepSeek-R1 and DeepSeek-R1-Zero in January 2025. The R1 model, unlike its o1 rival, is open source, which means that any developer can use it.

As such, V3 and R1 have exploded in popularity since their release, with DeepSeek’s V3-powered AI Assistant displacing ChatGPT at the top of the app stores. Venture capitalist Marc Andreessen, in a recent social media post, called DeepSeek’s chatbot “one of the most amazing and impressive breakthroughs I’ve ever seen” and a “profound gift to the world.”

What can DeepSeek do?

Built on open-source large language models, DeepSeek’s chatbots can do essentially everything that ChatGPT, Gemini, and Claude can. That includes text, audio, image, and video generation. What’s more, DeepSeek’s newly released family of multimodal models, dubbed Janus Pro, reportedly outperforms DALL-E 3 as well as PixArt-alpha, Emu3-Gen, and Stable Diffusion XL on a pair of industry benchmarks. DeepSeek-R1, which rivals o1, is specifically designed to perform complex reasoning tasks: it establishes “logical chains of thought,” explaining its reasoning step by step as it works toward a solution.

oh boy #deepseek

Alexios Mantzarlis (@mantzarlis.com), 2025-01-27T16:50:40.640Z

What DeepSeek’s products can’t do is talk about Tiananmen Square. Or the Yellow Umbrella protests. Or President Xi Jinping’s likeness to Winnie the Pooh. Basically, if it’s a subject considered verboten by the Chinese Communist Party, DeepSeek’s chatbots will not address it or engage in any meaningful way.

Who can use DeepSeek?

As an open-source LLM, DeepSeek’s model can be used by any developer for free. By contrast, OpenAI charges $200 per month for the Pro subscription needed to access o1. DeepSeek’s models are available on the web, through the company’s API, and via mobile apps. You will need to sign up for a free account at the DeepSeek website in order to use it; however, the company has temporarily paused new sign-ups in response to “large-scale malicious attacks on DeepSeek’s services.” Existing users can sign in and use the platform as normal, but there’s no word yet on when new users will be able to try DeepSeek for themselves.

Why is DeepSeek suddenly such a big deal?

Since the release of ChatGPT in November 2022, American AI companies have been laser-focused on building bigger, more powerful, more expansive, and more power- and resource-intensive large language models. Rather than seek to build more cost-effective and energy-efficient LLMs, companies like OpenAI, Microsoft, Anthropic, and Google instead saw fit to brute force the technology’s advancement by, in the American tradition, throwing absurd amounts of money and resources at the problem. In 2024 alone, xAI CEO Elon Musk was expected to personally spend upwards of $10 billion on AI initiatives. OpenAI and its partners just announced a $500 billion Project Stargate initiative that would drastically accelerate the construction of green energy utilities and AI data centers across the US. Google plans to prioritize scaling the Gemini platform throughout 2025, according to CEO Sundar Pichai, and is expected to spend billions this year in pursuit of that goal. Meta announced in mid-January that it would spend as much as $65 billion this year on AI development.

DeepSeek just showed the world that none of that is actually necessary, and that the “AI Boom,” which has been helping spur the American economy in recent months and which has made GPU companies like Nvidia exponentially more wealthy than they were in October 2023, may be nothing more than a sham. It also calls into question just how much of a lead the US actually has in AI, despite repeatedly banning shipments of leading-edge GPUs to China over the past year.

“The bottom line is the US outperformance has been driven by tech and the lead that US companies have in AI,” Keith Lerner, an analyst at Truist, told CNN. “The DeepSeek model rollout is leading investors to question the lead that US companies have and how much is being spent and whether that spending will lead to profits (or overspending).”

In short, DeepSeek just beat the American AI industry at its own game, showing that the current mantra of “growth at all costs” is no longer valid. “DeepSeek clearly doesn’t have access to as much compute as U.S. hyperscalers and somehow managed to develop a model that appears highly competitive,” Srini Pajjuri, semiconductor analyst at Raymond James, told CNBC. If a Chinese startup can build an AI model that works just as well as OpenAI’s latest and greatest, and do so in under two months and for less than $6 million, then what use is Sam Altman anymore?

“Time will tell if the DeepSeek threat is real; the race is on as to what technology works and how the big Western players will respond and evolve,” Michael Block, market strategist at Third Seven Capital, told CNN. “Markets had gotten too complacent on the beginning of the Trump 2.0 era and may have been looking for an excuse to pull back, and they got a great one here.”





