DeepSeek R1: Why do AI experts think it's so special?

Suddenly, DeepSeek is everywhere.

Its R1 model is open source, was supposedly trained at a fraction of the cost of other AI models, and is just as good as ChatGPT, if not better.

This lethal combination hit Wall Street hard, sending tech stocks tumbling and making investors question how much money is needed to develop good AI models. DeepSeek engineers claim R1 was trained on 2,788 GPUs that cost around $6 million, compared to OpenAI's GPT-4, which reportedly cost $100 million to train.

DeepSeek's low cost also challenges the idea that larger models and more data lead to better performance. Amid the frenzied conversation about DeepSeek's capabilities, its threat to AI companies such as OpenAI, and spooked investors, it can be hard to make sense of what is going on. But veteran AI experts have weighed in with valuable perspectives.

DeepSeek demonstrates what AI experts have been saying for years: bigger is not better

Hampered by trade restrictions and limited access to Nvidia GPUs, China-based DeepSeek had to get creative in developing and training R1. That it could pull off this feat for only $6 million (not much money in AI terms) was a revelation to investors.

But AI experts were not surprised. “At Google, I asked why they were fixated on building the largest model. Why are you going for size? What function are you trying to achieve? Why were you upset that you didn't have the largest model? They responded by firing me,” posted Timnit Gebru, who was fired from Google for calling out AI bias, on X.

Hugging Face's climate and AI lead, Sasha Luccioni, pointed out how precariously AI investment rests on marketing and hype. “It's wild that hinting that a single (high-performing) LLM is able to achieve that performance without brute-forcing thousands of GPUs is enough to cause this,” said Luccioni.

Clarifying why DeepSeek R1 is so important

DeepSeek R1 performed comparably to OpenAI's o1 model on key benchmarks. It marginally surpassed, matched, or fell just below o1 on math, coding, and general knowledge tests. That is to say, there are other models out there, such as Anthropic's Claude, Google's Gemini, and Meta's open source model Llama, that are just as capable for the average user.

But R1 is causing such a frenzy because of how little it cost to make. “It is not smarter than the previous models, only more economically trained,” said AI research scientist Gary Marcus.

The fact that DeepSeek was able to build a model that competes with OpenAI's models is quite remarkable. Andrej Karpathy, who co-founded OpenAI, asked on X, “Does this mean that you don't need large GPU clusters for frontier LLMs?” and pointed to the gains still to be had from data and algorithms.

Wharton AI professor Ethan Mollick said it is not about the model's capabilities, but about which models people currently have access to. “DeepSeek is a really good model, but it is generally not a better model than o1 or Claude,” he said. “But since it is free and getting a ton of attention, I think that many people who were using free ‘mini’ models are being exposed to what a 2025 reasoner can do and are surprised.”

Score one for open source AI models

DeepSeek R1's breakout is a big win for open source proponents, who argue that democratizing access to powerful models ensures transparency, innovation, and healthy competition. “For people who think that ‘China is surpassing the US’ [...],” said Yann LeCun, chief scientist at Meta, which has backed open source with its own Llama models.

Computer scientist and AI expert Andrew Ng did not explicitly mention the significance of R1 being an open source model, but he highlighted how DeepSeek's disruption is a boon for developers, since it opens up access that is otherwise gatekept by Big Tech.

“Today's ‘DeepSeek selloff’ in the stock market, attributed to DeepSeek V3/R1 disrupting the tech ecosystem, is another sign that the application layer is a great place to be,” said Ng. “The foundation model layer being hypercompetitive is great for people building applications.”
