Strategies & Market Trends : 2026 TeoTwawKi ... 2032 Darkest Interregnum
To: Julius Wong who wrote (210717) 1/30/2025 8:41:06 AM
From: TobagoJack  Read Replies (2) of 218715
 
Re <<DeepSeek … big signals … assumption>> … if true, the news is on the OMG level … coding was done using machine language, and … too scary to think about implications … shall apprise the Jack of the situation … namely that (1) DS learning was done using Huawei chips, (2) it can run on any other normal CPU, and (3) Nvidia GPUs are not necessary

In the meantime,
nature.com

Scientists flock to DeepSeek: how they’re using the blockbuster AI model

Researchers are testing how well the open model can perform scientific tasks — in topics from mathematics to cognitive neuroscience.
29 January 2025
The DeepSeek model is accessible through a chatbot app. Credit: Mladen Antonov/AFP via Getty

Scientists are flocking to DeepSeek-R1, a cheap and powerful artificial intelligence (AI) ‘reasoning’ model that sent the US stock market spiralling after it was released by a Chinese firm last week.

Repeated tests suggest that DeepSeek-R1’s ability to solve mathematics and science problems matches that of the o1 model, released in September by OpenAI in San Francisco, California, whose reasoning models are considered industry leaders.

Although R1 still fails on many tasks that researchers might want it to perform, it is giving scientists worldwide the opportunity to train custom reasoning models designed to solve problems in their disciplines.

“Based on its great performance and low cost, we believe Deepseek-R1 will encourage more scientists to try LLMs in their daily research, without worrying about the cost,” says Huan Sun, an AI researcher at Ohio State University in Columbus. “Almost every colleague and collaborator working in AI is talking about it.”

Open season

For researchers, R1’s cheapness and openness could be game-changers: using its application programming interface (API), they can query the model at a fraction of the cost of proprietary rivals, or for free by using its online chatbot, DeepThink. They can also download the model to their own servers and run and build on it for free — which isn’t possible with competing closed models such as o1.
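To make the cost point concrete, querying R1 over the API looks roughly like the Python sketch below. It assumes DeepSeek exposes an OpenAI-compatible chat endpoint; the base URL and model name are assumptions that should be checked against DeepSeek's own documentation.

    # Minimal sketch of querying DeepSeek-R1 over its API.
    # Assumes an OpenAI-compatible endpoint; base_url and the model
    # name are assumptions to verify against DeepSeek's documentation.
    from openai import OpenAI

    client = OpenAI(
        api_key="YOUR_DEEPSEEK_API_KEY",  # placeholder key
        base_url="https://api.deepseek.com",
    )

    response = client.chat.completions.create(
        model="deepseek-reasoner",  # assumed name of the R1 model
        messages=[
            {"role": "user",
             "content": "Prove that the sum of two even integers is even."},
        ],
    )

    print(response.choices[0].message.content)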

Since R1’s launch on 20 January, “tons of researchers” have been investigating training their own reasoning models, based on and inspired by R1, says Cong Lu, an AI researcher at the University of British Columbia in Vancouver. That’s backed up by data from Hugging Face, an open-science repository for AI that hosts the DeepSeek-R1 code. In the week since its launch, the site had logged more than three million downloads of different versions of R1, including those already built on by independent users.
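Downloading and running one of those versions locally follows the standard Hugging Face transformers workflow. Here is a minimal sketch, assuming one of the small distilled checkpoints; the repository id below is an assumption, and the full R1 model is far too large to run this way.

    # Sketch of running a distilled R1 checkpoint from Hugging Face.
    # The model id is an assumption; check the deepseek-ai organisation
    # on Hugging Face for the exact repository names.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed id

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "What is the derivative of x**3 with respect to x?"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))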

Scientific tasks

In preliminary tests of R1’s abilities on data-driven scientific tasks — taken from real papers in topics including bioinformatics, computational chemistry and cognitive neuroscience — the model matched o1’s performance, says Sun. Her team challenged both AI models to complete 20 tasks from a suite of problems they have created, called ScienceAgentBench. These include tasks such as analysing and visualizing data. Both models solved only around one-third of the challenges correctly. Running R1 using the API cost 13 times less than o1 did, but it had a slower “thinking” time than o1, notes Sun.
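The shape of such a head-to-head comparison can be pictured as a simple evaluation loop. This is not the ScienceAgentBench harness itself, only a sketch of the pattern; run_task() and is_correct() are hypothetical stand-ins for an agent harness and a grader.

    # Hedged sketch of a two-model benchmark loop; run_task() and
    # is_correct() are hypothetical helpers, not ScienceAgentBench code.
    def evaluate(model_name, tasks, run_task, is_correct):
        solved, total_cost = 0, 0.0
        for task in tasks:
            answer, cost = run_task(model_name, task)  # output plus API cost
            if is_correct(task, answer):
                solved += 1
            total_cost += cost
        return solved / len(tasks), total_cost

    # e.g. over a list of 20 tasks, as in the study:
    # acc_r1, cost_r1 = evaluate("deepseek-reasoner", tasks, run_task, is_correct)
    # acc_o1, cost_o1 = evaluate("o1", tasks, run_task, is_correct)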

R1 is also showing promise in mathematics. Frieder Simon, a mathematician and computer scientist at the University of Oxford, UK, challenged both models to create a proof in the abstract field of functional analysis and found R1’s argument more promising than o1’s. But given that such models make mistakes, researchers need to already be armed with skills such as telling a good proof from a bad one if they are to benefit from them, he says.

Much of the excitement over R1 is because it has been released as ‘open-weight’, meaning that the learnt connections between different parts of its algorithm are available to build on. Scientists who download R1, or one of the much smaller ‘distilled’ versions also released by DeepSeek, can improve its performance in their field through extra training, known as fine-tuning. Given a suitable data set, researchers could train the model to improve at coding tasks specific to the scientific process, says Sun.
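Such fine-tuning of a distilled checkpoint can be done with standard open-source tooling. Below is a minimal LoRA sketch using Hugging Face’s peft and trl libraries; the model id, data file and hyperparameters are illustrative assumptions, not DeepSeek’s own recipe.

    # Sketch of LoRA fine-tuning a distilled R1 model on a custom corpus.
    # Model id, data set and hyperparameters are illustrative assumptions;
    # the JSONL file is assumed to hold one {"text": ...} record per line.
    from datasets import load_dataset
    from peft import LoraConfig
    from trl import SFTConfig, SFTTrainer

    model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed id
    dataset = load_dataset("json", data_files="my_domain_tasks.jsonl",
                           split="train")

    peft_config = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")

    trainer = SFTTrainer(
        model=model_id,
        train_dataset=dataset,
        peft_config=peft_config,
        args=SFTConfig(output_dir="r1-finetuned", max_seq_length=2048),
    )
    trainer.train()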