✨ DeepSeek drama: The optimistic case for AI spending momentum
Plenty of historical precedent and the Jevons Paradox suggest AI efficiency gains will drive higher, not lower, total spending
I wanted to do a short piece/roundup on the continuing questions about the impact of Chinese startup DeepSeek and its powerful-but-cheap AI model. The company's apparent breakthrough has challenged the assumption that elite models require lots of expensive and energy-gobbling chips like those from Nvidia, whose shares investors hammered on Monday.
And while major US tech companies are investing heavily in advanced AI infrastructure — with firms like SoftBank and OpenAI recently pledging $500 billion for a venture called Stargate — DeepSeek has achieved comparable AI performance using cheaper, less-advanced chips and innovative training techniques. (DeepSeek claims it trained the underlying V3 model for just $5.6 million total: roughly $5.3 million for pre-training, $240,000 for context extension, and $10,000 for post-training.) The company has open-sourced its R1 model, allowing others to build on its cost-effective approach.
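For a rough sense of where that $5.6 million figure comes from, the sketch below redoes the arithmetic using the GPU-hour breakdown and the assumed $2-per-H800-GPU-hour rental rate from DeepSeek's own V3 technical report. These are the company's reported inputs for the final training run, not independently verified costs.

```python
# Rough reconstruction of DeepSeek's reported training-cost arithmetic for V3.
# GPU-hour figures and the $2/GPU-hour H800 rental rate are taken from the
# V3 technical report's own accounting and cover only the final training run.

H800_RATE_USD_PER_GPU_HOUR = 2.00  # assumed rental price used in the report

gpu_hours = {
    "pre-training": 2_664_000,     # reported as ~2664K H800 GPU-hours
    "context extension": 119_000,  # long-context extension stage
    "post-training": 5_000,        # supervised fine-tuning / RL stage
}

total_cost = 0.0
for stage, hours in gpu_hours.items():
    cost = hours * H800_RATE_USD_PER_GPU_HOUR
    total_cost += cost
    print(f"{stage:>17}: {hours:>9,} GPU-hours ≈ ${cost / 1e6:.2f}M")

print(f"{'total':>17}: {sum(gpu_hours.values()):>9,} GPU-hours ≈ ${total_cost / 1e6:.2f}M")
```

The total lands at roughly $5.6 million, the headline figure; the report itself notes that this excludes the cost of prior research and ablation experiments.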
It’s worth noting that some on Wall Street a) would like more clarity on DeepSeek’s cost claims and b) are far from ready to declare the momentum behind AI infrastructure spending over.
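As for the Jevons Paradox case in the subtitle (Jevons' original observation was that more efficient steam engines drove total coal consumption up, not down), the argument boils down to demand elasticity: if demand for AI compute is sufficiently price-elastic, cheaper models mean more total spending, not less. Here is a minimal sketch with constant-elasticity demand; the 10x cost reduction and the elasticity values are purely hypothetical, chosen to show the mechanism rather than to estimate the AI market.

```python
# Minimal Jevons-style rebound sketch with constant-elasticity demand:
#   quantity demanded  Q(p) = A * p**(-elasticity)
#   total spending     S(p) = p * Q(p) = A * p**(1 - elasticity)
# If elasticity > 1, a drop in the price of compute raises total spending.
# The 10x cost reduction and the elasticity values are hypothetical inputs.

def spending_multiplier(price_ratio: float, elasticity: float) -> float:
    """Factor by which total spending changes when the unit price is multiplied by price_ratio."""
    return price_ratio ** (1.0 - elasticity)

price_ratio = 0.1  # compute becomes 10x cheaper (hypothetical)

for elasticity in (0.5, 1.0, 1.5, 2.0):
    mult = spending_multiplier(price_ratio, elasticity)
    direction = "rises" if mult > 1 else "falls" if mult < 1 else "is unchanged"
    print(f"elasticity {elasticity:.1f}: total spending x{mult:.2f} ({direction})")
```

Only when elasticity exceeds 1 does the efficiency gain translate into higher total outlays, which is the condition the bullish spending case implicitly assumes.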