DeepSeek: Everything you need to know about the chatbot app

DeepSeek has gone viral.

Chinese AI lab DeepSeek broke into the mainstream consciousness this week after its chatbot app rose to the top of the Apple App Store charts. DeepSeek’s AI models, which were trained using compute-efficient techniques, have led Wall Street analysts and technologists to question whether the United States can maintain its lead in the AI race and whether demand for AI chips will be sustained.

But where did DeepSeek come from, and how did it rise to international fame so quickly?

DeepSeek’s trader origins

DeepSeek is backed by High-Flyer Capital Management, a Chinese quantitative hedge fund that uses AI to inform its trading decisions.

AI enthusiast Liang Wenfeng co-founded High-Flyer in 2015. Wenfeng, who reportedly began dabbling in trading while a student at Zhejiang University, launched High-Flyer Capital Management as a hedge fund in 2019 focused on developing and deploying AI algorithms.

In 2023, High-Flyer started DeepSeek as a lab dedicated to researching AI tools, separate from its financial business. With High-Flyer as one of its investors, the lab spun off into its own company, also called DeepSeek.

From day one, DeepSeek built its own data center clusters for model training. But like other AI companies in China, DeepSeek has been affected by U.S. export bans on hardware. To train one of its more recent models, the company was forced to use Nvidia H800 chips, a less powerful version of the H100 chip available to U.S. companies.

DeepSeek’s technical team is said to skew young. The company reportedly recruits doctoral AI researchers aggressively from top Chinese universities. DeepSeek also hires people without any computer science background to help its technology better understand a wide range of subjects, according to The New York Times.

DeepSeek’s strong models

DeepSeek unveiled its first set of models (DeepSeek Coder, DeepSeek LLM, and DeepSeek Chat) in November 2023. But it was not until last spring, when the startup released its next-generation DeepSeek-V2 family of models, that the AI industry began to take notice.

DeepSeek-V2, a general-purpose text- and image-analyzing system, performed well on various AI benchmarks and was far cheaper to run than comparable models at the time. It forced DeepSeek’s domestic competitors, including ByteDance and Alibaba, to cut usage prices for some of their models and make others completely free.

DeepSeek-V3, launched in December 2024, only added to DeepSeek’s reputation.

According to DeepSeek’s internal benchmark testing, DeepSeek V3 outperforms both downloadable, openly available models such as Meta’s Llama and “closed” models that can only be accessed through an API, such as OpenAI’s GPT-4o.

Equally impressive is DeepSeek’s R1 “reasoning” model. Released in January, R1 performs as well as OpenAI’s o1 model on key benchmarks, DeepSeek claims.

Being a reasoning model, R1 effectively fact-checks itself, which helps it avoid some of the pitfalls that normally trip up models. Reasoning models take a little longer (usually seconds to minutes longer) to arrive at solutions than a typical non-reasoning model. The upside is that they tend to be more reliable in domains such as physics, science, and math.

There is a downside to R1, DeepSeek V3, and DeepSeek’s other models, however. Being Chinese-developed AI, they are subject to benchmarking by China’s internet regulator to ensure that their responses “embody core socialist values.” In DeepSeek’s chatbot app, for example, R1 will not answer questions about Tiananmen Square or Taiwan’s autonomy.

A disruptive approach

If DeepSeek has a business model, it is not clear what that model is, exactly. The company prices its products and services well below market value and gives others away for free.

The way DeepSeek tells it, efficiency breakthroughs have enabled it to maintain extreme cost competitiveness. Some experts dispute the figures the company has supplied, however.

Whatever the case, developers have taken to DeepSeek’s models, which are not open source as the phrase is commonly understood but are available under permissive licenses that allow commercial use. According to Clem Delangue, the CEO of Hugging Face, one of the platforms hosting DeepSeek’s models, developers on Hugging Face have created over 500 “derivative” models of R1 that have racked up 2.5 million downloads combined.

DeepSeek’s success against larger and more established rivals has been described as “upending AI” and ushering in “a new era of AI brinkmanship.” The company’s success was at least partly responsible for Nvidia’s stock price dropping by 18% on Monday and for eliciting a public response from OpenAI CEO Sam Altman.

As for what DeepSeek’s future might hold, it is not clear. Improved models are a given. But the U.S. government appears to be wary of what it perceives as harmful foreign influence.
