In the last week of January 2025, the AI market, and the broader stock market, were shaken by the sudden rise of DeepSeek, a Chinese open-source AI model. DeepSeek R1 outperforms previous leading AI models (Meta's Llama 3.1, OpenAI's GPT-4, and Anthropic's Claude 3.5 Sonnet) on complex tasks while running on much cheaper chips.
The impact was severe: Nvidia, a leading chipmaker, lost nearly $600 billion in market value — the largest single-day loss in market value ever recorded for a U.S. company. The sell-off also hit other major tech companies, with Microsoft declining 2.14% and Alphabet (Google's parent company) falling over 4%.
What’s driving DeepSeek’s success, and how is it reshaping the AI industry while challenging the belief that quantum computing is essential to its future? Let’s dive in.
How DeepSeek R1 Works: From The First Model Launch To Market Breakthroughs
DeepSeek is a Chinese company founded in July 2023 by entrepreneur Liang Wenfeng, who runs a hedge fund, High-Flyer Capital, that uses AI to identify patterns in stock prices. Notably, Liang is reported to have stockpiled around 50,000 Nvidia A100 chips, which have been banned from export to China since September 2022. Experts believe this stockpile played a crucial role in developing DeepSeek's powerful AI models by combining these advanced chips with cheaper, less sophisticated ones.
After launching its initial set of models — DeepSeek Coder, LLM, and Chat — in November 2023, the company made waves with its cost-efficient DeepSeek-V2 model. This general-purpose system, capable of analyzing both text and images, was much cheaper to run than competitors at the time, forcing companies like ByteDance and Alibaba to reduce their prices.
In December 2024, DeepSeek released its powerful large language model, V3. Internal tests showed that V3 outperformed both open-source models like Meta's Llama and API-based models like GPT-4. Remarkably, V3's development cost just $5.58 million — far less than the estimated $100 million spent on GPT-4. It was trained on roughly 2,000 Nvidia H800 GPUs, chips designed to comply with U.S. export restrictions, compared with the 16,000 H100 chips used by other companies.
In January 2025, DeepSeek introduced R1, a “reasoning” model based on V3. R1 excels at tasks requiring complex context, like reading comprehension and strategic planning. While it takes a bit longer to process solutions, it is more reliable in areas such as science, math, and physics. DeepSeek R1 works on par with OpenAI’s o1 model in terms of performance.
DeepSeek also developed reasoning versions of smaller open-source models, making them available for use on home computers.
However, there is a notable drawback to DeepSeek’s R1 and V3 models. As Chinese-developed AI, they are bound by China’s internet regulations, which ensure that the AI’s responses align with “core socialist values.” For instance, R1 avoids answering questions related to sensitive topics like Taiwan’s autonomy.
READ: 9 Scary Facts About AI for Business
Why Does DeepSeek R1 Work So Well? Two Innovative Techniques Driving The AI Revolution
DeepSeek’s success comes from achieving outstanding results with fewer resources, thanks to two groundbreaking techniques.
The first technique revolves around the mathematical concept of “sparsity.” Sparsity can manifest in various ways in AI models. It can involve eliminating unnecessary data that doesn’t affect the model’s output, or in more advanced cases, removing entire sections of a neural network while still maintaining the model’s effectiveness. DeepSeek is an example of the second approach.
For example, the DeepSeek V3 model has around 671 billion parameters, or "weights," which determine how the model processes input and generates output such as text or images. However, only a small fraction of them is activated for any given input. Predicting which parameters are needed for a specific task is a challenge, but DeepSeek solves it with a technique that identifies the necessary parameters in advance, allowing the model to focus computation on only those. This approach leads to significant reductions in training time compared to traditional methods.
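To make the sparsity idea concrete, here is a toy sketch of mixture-of-experts-style routing, where a small "router" picks only a few expert sub-networks per token so most parameters stay idle. This is a hypothetical illustration, not DeepSeek's actual implementation; all names and dimensions (`N_EXPERTS`, `TOP_K`, `moe_forward`) are made up, and the tiny sizes exist only to keep the example readable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions for illustration only -- real models use thousands of
# experts and far larger hidden sizes.
N_EXPERTS = 8      # number of small expert sub-networks
TOP_K = 2          # experts actually activated per token (the "sparsity")
D_MODEL = 16       # hidden vector size

# Each expert is a tiny linear layer; the router scores experts per token.
experts = [rng.normal(size=(D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]
router = rng.normal(size=(D_MODEL, N_EXPERTS))

def moe_forward(x):
    """Route one token vector through only its top-k experts."""
    scores = x @ router                      # one relevance score per expert
    top = np.argsort(scores)[-TOP_K:]        # indices of the best-scoring experts
    weights = np.exp(scores[top])
    weights /= weights.sum()                 # softmax over the chosen experts
    # Only TOP_K of the N_EXPERTS weight matrices are touched for this token,
    # so most parameters contribute no compute to this forward pass.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.normal(size=D_MODEL)
out = moe_forward(token)
print(out.shape)
```

The key point is that the per-token cost scales with `TOP_K`, not with the total parameter count, which is how a 671-billion-parameter model can stay cheap to run.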
The second technique involves a concept known as "multi-head latent attention." In the DeepSeek V3 model, this technique compresses the key-value cache, which stores intermediate representations of the most recent input tokens — one of the largest consumers of memory and bandwidth. This compression reduces the memory load and improves memory management, leading to better overall performance.
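The compression idea can be sketched as follows: instead of caching full-size keys and values for every past token, the model caches a small latent vector per token and reconstructs keys and values from it when attention is computed. This is a simplified, hypothetical sketch of the general latent-compression idea, not DeepSeek's exact architecture; all names and sizes (`D_LATENT`, `W_down`, `attend`) are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes for illustration only.
D_MODEL = 64      # hidden size per token
D_LATENT = 8      # compressed latent size actually stored in the cache
SEQ_LEN = 10      # tokens processed so far

# A down-projection compresses each token's state into a small latent;
# up-projections recover keys and values from that latent at attention time.
W_down = rng.normal(size=(D_MODEL, D_LATENT)) / np.sqrt(D_MODEL)
W_up_k = rng.normal(size=(D_LATENT, D_MODEL)) / np.sqrt(D_LATENT)
W_up_v = rng.normal(size=(D_LATENT, D_MODEL)) / np.sqrt(D_LATENT)

hidden = rng.normal(size=(SEQ_LEN, D_MODEL))   # per-token hidden states

# Cache only the latents: SEQ_LEN x D_LATENT floats, instead of
# 2 x SEQ_LEN x D_MODEL for separate full-size keys and values.
kv_cache = hidden @ W_down

def attend(query):
    """Attention for one query vector over the compressed cache."""
    keys = kv_cache @ W_up_k                 # reconstruct keys on the fly
    values = kv_cache @ W_up_v               # reconstruct values on the fly
    scores = keys @ query / np.sqrt(D_MODEL)
    probs = np.exp(scores - scores.max())
    probs /= probs.sum()                     # softmax attention weights
    return probs @ values

out = attend(rng.normal(size=D_MODEL))
print(kv_cache.shape, out.shape)
```

In this toy setup the cache stores 8 floats per token instead of 128 (keys plus values), a 16x reduction — the same trade-off, at much larger scale, is what frees memory and bandwidth in the real model.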
How Could DeepSeek R1 Disrupt The AI Industry?
DeepSeek’s open-source approach is a wake-up call for the whole AI industry. Released under the MIT License, its models are freely available for download, modification, and commercial use. As of January 27, 2025, developers on Hugging Face have created over 500 derivative models of DeepSeek R1, with a combined 2.5 million downloads.
For tech companies relying on proprietary AI, this presents a significant threat. Freely available AI models could erode profit margins and shift investor attention away from big tech, as the value of proprietary systems diminishes in favor of more accessible, cost-efficient solutions.
However, where some see disruption, others see opportunity. Small AI startups, researchers, and independent developers now have access to cutting-edge AI without the massive infrastructure costs. For consumers, this could mean AI models running directly on personal devices — laptops, phones, and wearables — eliminating the need for expensive cloud-based subscriptions.
More importantly, DeepSeek’s success challenges the long-standing belief that AI’s future depends solely on quantum computing. Historically, quantum computing has been seen as the key to solving AI’s most complex challenges, such as optimization and large-scale problem-solving, due to its ability to process vast amounts of data simultaneously through quantum superposition and entanglement. However, DeepSeek’s breakthrough demonstrates that through innovations in chip design, optimization techniques, and algorithm development, classical systems are capable of delivering powerful AI models at scale.
As quantum computing remains a distant reality for most, focusing on accessible, affordable advancements in classical computing may be the most pragmatic path forward for AI innovation.
READ: AI in 2024: Overview of Key AI Innovations and Their Impact on the Tech Industry
JetSoftPro: Leading the Charge in AI Innovation
At JetSoftPro, we harness the power of cutting-edge AI advancements, such as DeepSeek’s groundbreaking models, to deliver impactful and scalable solutions. By focusing on efficiency, cost optimization, and strict data privacy, we help businesses successfully integrate advanced AI technologies.
Contact us to learn how we can help you unlock the full potential of AI for your business.
The post How DeepSeek R1 Works and Changes the AI Market appeared first on JetSoftPro | Custom Technology Solutions & Software Development.