Chinese AI startup DeepSeek has made one of the most aggressive pricing moves in the industry by permanently slashing the cost of its flagship V4-Pro AI model by 75%. The new pricing, announced without fanfare, brings the cost of using DeepSeek’s most powerful large language model down to between 0.025 and 6 yuan per million tokens, depending on workload type, a stark drop from the previous 0.1 to 24 yuan range. For developers building AI applications, agents, or services, this reduction can dramatically lower operational expenses and accelerate adoption of generative AI in China.
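The figures above can be checked with a few lines of arithmetic. The sketch below confirms that both ends of the range fell by the same 75%, then estimates what that means for a monthly bill. The 500-million-token workload and the function names are illustrative assumptions, not figures from DeepSeek.

```python
# Published V4-Pro price range in yuan per million tokens (from the article).
OLD_RANGE = (0.1, 24.0)
NEW_RANGE = (0.025, 6.0)


def percent_cut(old: float, new: float) -> float:
    """Return the price reduction as a percentage of the old price."""
    return (old - new) / old * 100


def workload_cost(tokens: int, price_per_million: float) -> float:
    """Return the cost in yuan for a token count at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


# Hypothetical workload (an assumption): 500 million tokens per month,
# billed at the most expensive tier of each range.
MONTHLY_TOKENS = 500_000_000
old_cost = workload_cost(MONTHLY_TOKENS, OLD_RANGE[1])
new_cost = workload_cost(MONTHLY_TOKENS, NEW_RANGE[1])

for old, new in zip(OLD_RANGE, NEW_RANGE):
    print(f"{old} -> {new} yuan: {percent_cut(old, new):.0f}% cut")
print(f"Monthly bill at top tier: {old_cost:.0f} -> {new_cost:.0f} yuan")
```

Under these assumptions, a developer at the top tier would see the monthly bill fall from 12,000 yuan to 3,000 yuan.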
DeepSeek, founded in 2023, has rapidly become a notable player in the Chinese AI ecosystem. The company’s V4-Pro model is designed for complex reasoning, code generation, and conversational tasks. Its cheaper sibling, V4-Flash, has been widely used for lighter tasks, but the Pro variant previously carried a premium because it required access to advanced compute hardware. At launch, DeepSeek acknowledged that Pro pricing was up to 12 times higher than Flash because of limited access to high-end AI chips, a direct consequence of US export controls that restrict sales of NVIDIA’s A100, H100, and newer Blackwell GPUs to China. These chips are the gold standard for training and running large language models (LLMs), and their unavailability forced Chinese firms to seek alternatives.
Huawei’s Ascend chips become a game changer
Industry analysts immediately pointed to Huawei’s Ascend line of AI accelerators as the likely enabler of DeepSeek’s price reduction. The Ascend 910 series, and more recently the 950 series, have emerged as the leading homegrown alternative to NVIDIA’s products. Huawei has been scaling production of these chips despite facing its own manufacturing bottlenecks due to US sanctions on advanced chipmaking equipment. Nevertheless, the company’s efforts appear to be bearing fruit. DeepSeek has not officially confirmed the shift, but the timeline aligns with reports that Chinese data centers are increasingly deploying Ascend clusters to service AI model providers.
The Ascend 950 chip, launched in 2024, offers competitive performance for AI inference tasks, especially for relatively small models like V4-Pro (estimated to have around 70 billion parameters). Its improved memory bandwidth and interconnect capabilities make it suitable for serving LLMs at scale. While still trailing NVIDIA’s top-tier GPUs in raw training performance, the Ascend 950 is a viable option for inference, which is the stage where a model is deployed to answer user queries or generate content. By moving more inference workload to Ascend-based servers, DeepSeek can significantly cut compute costs, passing savings to developers.
Financial and strategic implications
The price cut is not just a technical achievement; it is a strategic move. DeepSeek is effectively challenging the notion that high-quality AI inference must remain expensive. The company’s early emphasis on cost efficiency has been a differentiating factor in a market where OpenAI charges roughly $15 per million tokens for GPT-4 and Anthropic charges $10 for Claude 3.5. Converted to US dollars, DeepSeek’s new pricing is far lower: 0.025 yuan is about $0.0035, and 6 yuan is about $0.83. That is a massive price advantage, even accounting for differences in model performance and reliability.
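The dollar figures follow from a simple currency conversion. The sketch below reproduces them using an assumed exchange rate of 7.2 yuan per US dollar, which the article does not state but which its numbers imply. It then compares the result with the Western prices quoted above; the variable names are illustrative.

```python
# Assumed exchange rate in yuan per US dollar. The article does not state a
# rate; 7.2 is the value implied by its converted figures.
CNY_PER_USD = 7.2

# Per-million-token prices quoted in the article.
DEEPSEEK_YUAN = {"low": 0.025, "high": 6.0}
WESTERN_USD = {"GPT-4": 15.0, "Claude 3.5": 10.0}


def yuan_to_usd(yuan: float, rate: float = CNY_PER_USD) -> float:
    """Convert a yuan amount to US dollars at the given rate."""
    return yuan / rate


low_usd = yuan_to_usd(DEEPSEEK_YUAN["low"])
high_usd = yuan_to_usd(DEEPSEEK_YUAN["high"])
print(f"DeepSeek V4-Pro: ${low_usd:.4f} to ${high_usd:.2f} per million tokens")

# How many times more expensive each quoted Western price is than
# DeepSeek's most expensive tier.
for name, usd in WESTERN_USD.items():
    print(f"{name}: about {usd / high_usd:.0f}x DeepSeek's top tier")
```

At that rate, the quoted GPT-4 price is about 18 times DeepSeek’s most expensive tier, and the quoted Claude 3.5 price about 12 times. The comparison is rough, since providers typically bill input and output tokens at different rates.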
Western AI providers have been under pressure to reduce costs as competition intensifies. OpenAI reduced prices for its GPT-3.5 Turbo earlier in 2024 and continues to optimize model efficiency. Anthropic and Google have also introduced cheaper tiers. A 75% cut by a Chinese rival may push them to accelerate efficiency gains or accept thinner margins on inference services. The global AI price war is heating up, and the battleground is shifting from training costs to inference costs, meaning the cost of serving models at scale.
DeepSeek’s move also puts pressure on other Chinese AI providers, including Baidu with Ernie, Alibaba with Tongyi Qianwen, and ByteDance with Doubao. These companies have their own large models and have been competing on token pricing. DeepSeek’s aggressive pricing may force them to either match the reduction or risk losing developer mindshare. The Chinese AI market is notoriously competitive, with dozens of model providers vying for users. Price cuts of this magnitude could trigger a race to the bottom, benefiting consumers and businesses but challenging the profitability of some firms.
Broader context: US export controls and Chinese self-sufficiency
The development underscores the ongoing impact of US export controls on advanced semiconductors. The Biden administration’s October 2022 rules and subsequent updates banned the sale of NVIDIA’s A100 and H100 chips to China, as well as any chip with equivalent performance. These restrictions were designed to hinder China’s military modernization, but they also inadvertently accelerated the country’s push for self-sufficiency in AI hardware. Chinese companies, from Huawei to local startups like MetaX and Enflame, have now developed competitive accelerators. Huawei’s Ascend series has become the most prominent, with the 910B and 950 chips being used by dozens of AI firms for both training and inference.
Still, challenges remain. Huawei’s production capacity is constrained by limited access to the EUV lithography equipment used for cutting-edge processes at 7nm and below. The company has managed to produce the Ascend 950 using its own modified process, but yields may be lower than TSMC’s. Additionally, the AI software ecosystem is dominated by NVIDIA’s CUDA; Huawei has its own CANN framework, but it is not as widely adopted. DeepSeek’s ability to use Ascend effectively suggests that the software gap is narrowing, at least for certain applications.
If Chinese firms can continue scaling AI performance while reducing costs, the global balance of AI capabilities may shift. Already, analysts note that China has produced large models that rival the best from the US in specific benchmarks, such as coding and mathematics. The efficiency gains from hardware customization and software optimization could allow Chinese companies to offer AI services at a fraction of the cost of their Western peers, potentially drawing international customers who care about price over geopolitical ties.
DeepSeek’s price cut may be an early signal of this trend. As more Chinese data centers adopt Ascend and other domestic chips, the infrastructure bottleneck that has kept AI costs high will slowly dissolve. The company’s move is a bet that hardware is no longer the limiting factor, at least for inference. If that bet pays off, the entire AI industry could see a fundamental rethinking of pricing models. The quiet announcement of a 75% reduction in price may well be echoed by other players in the coming months, reshaping the economics of artificial intelligence worldwide.
Source: Digital Trends News