The Rise of DeepSeek
- May 4, 2025
- Insurance Directions
Introduction
Recently, the AI language model DeepSeek has made waves, earning the nickname "a product of Eastern mystical power" among tech enthusiasts around the world. This homegrown generative language model has captured the attention of everyone looking toward the future of technology.
DeepSeek actually first made headlines late last year with its R1 model, igniting discussions within the AI community. However, it wasn't until the recent surge in public interest that it caught the eye of mainstream media in China.
As of now, DeepSeek tops the free download charts in app stores, both domestically and internationally.
The sudden influx of new users led to unexpected server overloads, causing several crashes as the infrastructure struggled to keep pace.
The intense popularity of DeepSeek has quickly made it the hottest topic in conversations leading up to the Year of the Snake Lunar New Year celebrations.
What accounts for DeepSeek's astonishing rise among various large language models (LLMs) and its global breakout? I believe there are four key factors at play:
Firstly, the latest versions of DeepSeek, V3 and R1, exhibit significant technical innovations in how LLMs are trained.
They build on the attention architecture with engineering optimizations such as Multi-head Latent Attention (MLA) and DeepSeekMoE, a mixture-of-experts design.
Moreover, the training methodology itself is innovative: a focus on reinforcement learning (RL), training with little or no supervised fine-tuning, expert load-balancing techniques, and a dual-pipeline scheduling mechanism that conserves GPU compute. This makes per-token communication more efficient, cutting unnecessary consumption and making the training process more energy-efficient.
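To make the idea of expert load balancing more concrete, here is a toy sketch in Python. It is not DeepSeek's implementation: the function names, the skewed scores, and the bias step size are all assumptions chosen for illustration. Each token is routed to its top-scoring experts, and a per-expert bias is nudged down for overloaded experts and up for underloaded ones so that work spreads out over time.

```python
import random


def route_tokens(scores, bias, top_k):
    """For each token, pick the top_k experts ranked by score + bias."""
    choices = []
    for token_scores in scores:
        ranked = sorted(
            range(len(token_scores)),
            key=lambda e: token_scores[e] + bias[e],
            reverse=True,
        )
        choices.append(ranked[:top_k])
    return choices


def expert_load(choices, num_experts):
    """Count how many token slots each expert received."""
    load = [0] * num_experts
    for chosen in choices:
        for e in chosen:
            load[e] += 1
    return load


def update_bias(bias, load, step):
    """Lower the bias of overloaded experts, raise it for underloaded ones."""
    mean = sum(load) / len(load)
    new_bias = []
    for b, l in zip(bias, load):
        if l > mean:
            new_bias.append(b - step)
        elif l < mean:
            new_bias.append(b + step)
        else:
            new_bias.append(b)
    return new_bias


def simulate(num_tokens=512, num_experts=8, top_k=2, rounds=50, step=0.05, seed=0):
    """Run several routing rounds; return the per-round loads and final bias."""
    rng = random.Random(seed)
    # Hypothetical skew: expert 0 always looks more attractive to the router.
    skew = [1.0] + [0.0] * (num_experts - 1)
    bias = [0.0] * num_experts
    history = []
    for _ in range(rounds):
        scores = [
            [rng.random() + skew[e] for e in range(num_experts)]
            for _ in range(num_tokens)
        ]
        choices = route_tokens(scores, bias, top_k)
        load = expert_load(choices, num_experts)
        history.append(load)
        bias = update_bias(bias, load, step)
    return history, bias


if __name__ == "__main__":
    history, bias = simulate()
    print("first round load:", history[0])
    print("last round load: ", history[-1])
```

In the first round every token picks the favored expert; as the bias adjusts, that expert's share shrinks toward the average. The appeal of this style of balancing is that it steers routing without adding an extra loss term, though real systems involve many details this sketch leaves out.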
Secondly, DeepSeek's requirements for computational power and equipment are lower than those of its competitors, which translates to reduced costs.
Its parent company reportedly stockpiled a substantial number of high-end NVIDIA A100 GPUs before the chip export bans took effect, allowing it to build a training cluster for its models.
As a result, the research and development costs for DeepSeek's latest models are reportedly just one-seventh of those for OpenAI's ChatGPT series and one-tenth of those for Meta's Llama product line.
DeepSeek's founder, Liang Wenfeng, mentioned in a recent interview that while funding for R&D is not an issue, navigating the challenges posed by high-end chip export bans has proven to be quite tricky.
This means that when it comes to computational power and equipment, constraints still exist.
Some foreign experts think that DeepSeek's achievements are not as monumental as they seem, calling it a victory powered mainly by open-source technology.
They argue that if high-end chip export bans persist, this could be the only sliver of hope left.
Thirdly, DeepSeek has sent shockwaves through Silicon Valley, for two reasons:
To begin with, the company has awakened Silicon Valley elites to the realization that innovation is not confined to their side of the Pacific: China is also making strides in LLM technology, and its performance is commendable. This fundamentally shatters the stereotype that “China only does application innovation.”
The second point is DeepSeek’s surprisingly fast product iteration on relatively modest funding, which undercuts business narratives built around massive computational power. For instance, amid a sharp sell-off, NVIDIA’s market capitalization was down by over 8% as of January 27, while TSMC’s fell by more than 11%. The sell-off also weighed on major Chinese firms like Tencent, ByteDance, and Alibaba.
As a result, this phenomenon not only puts pressure on U.S. stocks but could also revolutionize the entire industry model surrounding LLM development, even though the foundational logic remains unchanged.
Fourthly, the most significant contribution to DeepSeek's success is open-source technology, which is undeniably a positive influence. However, open source also has negative aspects that demand attention and cannot be adequately addressed in a few sentences, so I will reserve that discussion for the final section.
In fact, DeepSeek's rise poses no threat to the AI landscape on the other side of the Pacific; rather, it can generate three substantial advantages:
Firstly, it compels large corporations involved in AI LLM training to maintain transparency in funding, especially towards shareholders and other direct stakeholders.
Secondly, it allows small to medium enterprises in Silicon Valley and other regions to find hope, helping them escape the long-standing dominance of larger players in the funding market, thereby enabling them to acquire relatively equal financing support.
Thirdly, given that DeepSeek emerged from one of the largest investment firms in the A-share market, it offers regulatory bodies across the Pacific the chance to recognize the feasibility and effectiveness of using highly logical financial transaction data in AI training.
This could lead to normalized regulations and guide Wall Street financial data to be correctly utilized in training AI models, enhancing various capabilities like reasoning and analysis.
Moreover, DeepSeek’s advent signifies more than just a solid step forward in global technological innovation for China; it also highlights areas of critical importance.
The first is the necessity for self-reliance in both hardware and software development, especially given the continuous external pressures and sanctions.
The second emphasizes the necessity of showcasing the efficiency-oriented growth approaches of private enterprises and non-public entities, which are vital for China's macroeconomic transformation, broader reform, and comprehensive industrial upgrade strategies.
Additionally, some people, knowingly or not, advocate closed-source technologies out of nationalistic sentiment. This reaction to DeepSeek can be interpreted as a form of backward "cognitive warfare." However, it also stands as China's introduction to countering similar cognitive or narrative battles externally.
This brings me to ponder: seen from a social sciences perspective rather than a natural sciences one, DeepSeek's rise might reveal a profoundly negative developmental tendency, rooted in the unity of opposites between the open-source and closed-source paradigms.
The greatest challenge humanity faces today is not whether AI will be misused; rather, it is that relatively closed societies can leverage their scale and effective governance to plunder the resources of open-source systems. This situation is notably reminiscent of historical colonial conquests and is the fundamental reason behind external sanctions and curtailments.
If this Leviathan were a society embodying modern principles, prioritizing individual property rights and ethical governance, it might serve as a “necessary evil” towards supplanting the previous paradigm.
However, it still operates on pre-modern, enterprising mindsets while grappling with substantial societal costs.
This conflict has persistently resulted in cyclical turmoil for its citizens and complicates their awareness of the significant impacts such scale brings to the broader global context.
Should this contemporary Leviathan pursue virtuous means, humanity could benefit; otherwise, the reverse holds true.
It’s unfortunate to observe that no humanities and social sciences scholar has articulated these insights clearly, nor do the stewards of this Leviathan grasp the forthcoming trends.
My position is transparent: to eliminate nationalist narratives and approach issues from a global human perspective.
Consider the scenario where the open-source community sees a significant influx of participants from relatively closed societies who exploit technological advantages to maximize benefits for their own Leviathan. This could suppress the practical application of open-source technologies elsewhere, and everyone else's pursuit of profit would be undermined by the siphoning effect of that scale.
The outcome would not represent a jointly beneficent technological innovation led by that Leviathan but rather result in an intensified wealth gap akin to the oppressive hierarchies of the past.
Furthermore, as human nature inclines towards equality, the once-dominant ideologies of the open-source community may swiftly diminish, leaving only those from the Leviathan holding the banners of innovation and open-source ethos.
Yet, within that Leviathan, the narrative forces that maintain its sovereignty are already declining, preventing sustainable appropriation from the open-source ecosystem.
Looking towards the future, other Leviathans will either prepare defensive strategies against this siphoning or, if succumbing to complete appropriation, inevitably share in both gains and losses alike.
This paints a dystopian picture of a reality where open societies and open-source systems could be utterly subdued by their closed counterparts, akin to a dimensional assault.
Humanity would find itself entrenched in a “technological stagnation” era—an echo of historical periods where societal productivity barely advanced.
Thus, the innate Leviathan continues, persisting like the eternal night.
Conclusion
In summary, the emergence of the Chinese AI language model DeepSeek represents a significant addition to the global open-source community, showcasing the intelligence and innovation of its people.