Strategies & Market Trends : Technical analysis for shorts & longs

To: Johnny Canuck who wrote (59398) 7/6/2024 1:31:30 PM
From: Johnny Canuck  Read Replies (1) of 70132
 

archive.today capture of a www.wsj.com page, saved 6 Jul 2024 16:21:37 UTC







For AI Giants, Smaller Is Sometimes Better
Companies are turning their attention to less powerful models, hoping lower costs and solid performance will win more customers

By Tom Dotan and Deepa Seetharaman

July 6, 2024 5:30 am ET

ILLUSTRATION: EMIL LENDOF, ISTOCK
The start of the artificial-intelligence arms race was all about going big: Giant models trained on mountains of data, attempting to mimic human-level intelligence.
Now, tech giants and startups are thinking smaller as they slim down AI software to make it cheaper, faster and more specialized.

This category of AI software—called small or medium language models—is trained on less data and often designed for specific tasks.
The largest models, like OpenAI’s GPT-4, cost more than $100 million to develop and use more than one trillion parameters, a measurement of their size. Smaller models are often trained on narrower data sets—just on legal issues, for example—and can cost less than $10 million to train, using fewer than 10 billion parameters. The smaller models also use less computing power, and thus cost less, to respond to each query.
Microsoft has played up its family of small models named Phi, which Chief Executive Satya Nadella said are 1/100th the size of the free model behind OpenAI’s ChatGPT and perform many tasks nearly as well.
“I think we increasingly believe it’s going to be a world of different models,” said Yusuf Mehdi, Microsoft’s chief commercial officer.
Microsoft CEO Satya Nadella has said the company’s small models are 1/100th the size of the free model behind ChatGPT. PHOTO: JASON REDMOND/AGENCE FRANCE-PRESSE/GETTY IMAGES
Microsoft was one of the first big tech companies to bet billions of dollars on generative AI, and it quickly realized the technology was more expensive to operate than it had initially anticipated, Mehdi said.
The company also recently launched AI laptops that use dozens of AI models for search and image generation. The models require so little data that they can be run on a device and don’t require access to massive cloud-based supercomputers as ChatGPT does.
Google, as well as AI startups Mistral, Anthropic and Cohere, has also released smaller models this year. Apple unveiled its own AI road map in June with plans to use small models so that it could run the software entirely on phones to make it faster and more secure.
Even OpenAI, which has been at the vanguard of the large-model movement, recently released a version of its flagship model it says is cheaper to operate. A spokeswoman said the company is open to releasing smaller models in the future.
For many tasks, like summarizing documents or generating images, large models can be overkill—the equivalent of driving a tank to pick up groceries.
“It shouldn’t take quadrillions of operations to compute 2 + 2,” said Illia Polosukhin, who currently works on blockchain technology and was one of the authors of a seminal 2017 Google paper that laid the foundation for the current generative AI boom.
OpenAI offices in San Francisco. The company recently released a version of its flagship model that it says is cheaper to operate. PHOTO: CLARA MOKRI FOR THE WALL STREET JOURNAL
Businesses and consumers have also been looking for ways to run generative AI-based technology more cheaply while its returns are still unclear.
Because they use less computing power, small models can answer questions for as little as one-sixth the cost of large language models in many cases, said Yoav Shoham, co-founder of AI21 Labs, a Tel Aviv-based AI company. “If you’re doing hundreds of thousands or millions of answers, the economics don’t work” to use a large model, Shoham said.
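As a rough illustration of the economics Shoham describes, here is a back-of-envelope sketch in Python; the per-query prices are assumptions made up for the example, and only the one-sixth ratio and the "millions of answers" volume come from his remarks:

    # Back-of-envelope cost comparison. The per-query prices are illustrative
    # assumptions; only the one-sixth ratio is taken from the article.
    large_cost_per_query = 0.006                     # assumed USD per answer, large model
    small_cost_per_query = large_cost_per_query / 6  # small model at one-sixth the cost
    queries = 1_000_000                              # "millions of answers"

    print(f"Large model: ${large_cost_per_query * queries:,.0f}")  # Large model: $6,000
    print(f"Small model: ${small_cost_per_query * queries:,.0f}")  # Small model: $1,000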
The key is focusing these smaller models on a set of data like internal communications, legal documents or sales numbers to perform specific tasks like writing emails—a process known as fine-tuning. That process allows small models to perform as effectively as a large model on those tasks at a fraction of the cost.
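For readers curious what that fine-tuning step looks like mechanically, here is a minimal sketch using the open-source Hugging Face libraries; the base model, the "internal_emails.txt" file and the hyperparameters are illustrative placeholders, not the actual setup of any company in this article:

    from datasets import load_dataset
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    # A small base model (illustrative choice) and a narrow, task-specific corpus,
    # e.g. a plain-text file of internal emails (hypothetical file name).
    model_name = "microsoft/phi-2"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token   # ensure a pad token exists for batching
    model = AutoModelForCausalLM.from_pretrained(model_name)

    dataset = load_dataset("text", data_files={"train": "internal_emails.txt"})

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=512)

    tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

    # Causal-LM collator copies inputs to labels so the Trainer can compute a loss.
    collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

    args = TrainingArguments(output_dir="small-model-email-finetune",
                             per_device_train_batch_size=4,
                             num_train_epochs=1,
                             learning_rate=2e-5)

    Trainer(model=model, args=args, train_dataset=tokenized,
            data_collator=collator).train()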
“Getting these smaller, specialized models to work in these more boring but important areas” is the frontier of AI right now, said Alex Ratner, co-founder of Snorkel AI, a startup that helps companies customize AI models.
The credit-rating company Experian shifted from large models to small ones for the AI chatbots it uses for financial advice and customer service.
Once trained on the company’s internal data, the smaller models performed as well as large ones at a fraction of the cost, said Ali Khan, Experian’s chief data officer.
The models “train on a well-defined problem area and set of tasks, as opposed to giving me a recipe for flan,” he said.
Experian is using small models for the AI chatbots it uses for financial advice and customer service. PHOTO: BING GUAN/BLOOMBERG NEWS
The smaller models also are faster, said Clara Shih, head of AI at Salesforce.
“You end up overpaying and have latency issues” with large models, Shih said. “It’s overkill.”
The move to smaller models comes as progress on publicly released large models is slowing. Since OpenAI released GPT-4 last year, a significant advance in capabilities over the prior model GPT-3.5, no new models have been released that make an equivalent jump forward. Researchers attribute this to factors including a scarcity of high-quality, new data for training.
That trend has turned attention to the smaller models.
“There is this little moment of lull where everybody is waiting,” said Sébastien Bubeck, the Microsoft executive who is leading the Phi model project. “It makes sense that your attention gets diverted to, ‘OK, can you actually make this stuff more efficient?’”

Whether this lull is temporary or a broader technological issue isn’t yet known. But the small-model moment speaks to the evolution of AI from science-fiction-like demos to the less exciting reality of making it a business.
Companies aren’t giving up on large models, though. Apple announced it was incorporating ChatGPT into its Siri assistant to carry out more sophisticated tasks like composing emails. Microsoft said its newest version of Windows would integrate the most recent model from OpenAI.
Still, both companies made the OpenAI integrations a minor part of their overall AI package. Apple discussed it for only two minutes in a nearly two-hour-long presentation.
— Berber Jin contributed to this article.
Write to Tom Dotan at tom.dotan@wsj.com and Deepa Seetharaman at deepa.seetharaman@wsj.com


Copyright ©2024 Dow Jones & Company, Inc. All Rights Reserved.

Conversation (30 comments)



  • LL

    Lawrence L

    5 minutes ago



    The power consumption is outrageous. They will eat up all the stable power and leave us with the power that shuts down on a cloudy day.






  • SH

    Shoko Hoshi

    6 minutes ago



    These are essentially AI agents. A lot of companies are developing them. They span very broadly -- from email marketing to predicting an older person's gait. Yes, you can do it with a much smaller dataset. And it's not just the L, or Language, anymore. Models can encompass pictures (= frames of a video) as well.
    (Edited)





  • TM

    Tom Medl

    35 minutes ago



    Smaller models are also "greener" models






  • KW

    ken warren

    1 hour ago



    Should there be an EPA Environmental Impact Assessment on AI and Bitcoin "mining," and on the cost and disposal of EV car batteries? Do we need to make more (sometimes risky, like nuclear) energy to support arbitrary initiatives, or do we use real human intelligence to do a basic cost-benefit analysis?






  • TO

    Tim O

    1 hour ago



    Anyone know if AI can put dog faces on Twisted Sister singing we ain't gonna take it anymore? Asking for a dog freaked out by July 4th boom booms.






    • BI

      betsy ilfeld

      28 minutes ago



      Ask your vet about an anti-anxiety medication. It worked for our two dogs this week.








  • DR

    Dev Roy

    1 hour ago



    Mustafa Suleyman, on the Product-Led AI Greymatter Podcast, also agrees with the premise of this article. He believes that small open-source models and proprietary data are the way forward. As a practitioner, I usually take a performant small model and fine-tune it to achieve more controllable and predictable outputs for my task than a large, expensive model, whose size makes deployment impractical and uneconomical. Training these small models costs a few thousand dollars, not millions. I see some value for LLMs today in problems where imprecision is a virtue given their stochastic nature, such as composing emails, marketing copy, etc. My view from the trenches is that I cannot get my head around the increase in value of NVDA stock, given that most of the compute being used currently is for training and not inference; i.e., value is not yet accruing to customers, and businesses don't like spending money on things that don't give returns.
    (Edited)





  • JM

    Jim McCreary

    1 hour ago



    FINALLY, we're seeing some signs of HUMAN intelligence in the AI hype-fest.

    That said, your Large-vs-Small distinction still misses the essential distinguishing dimension, which is Fact-based vs Opinion-based. The only way serious professional people are going to RELY on AI-generated output is when those people KNOW that their AI systems are using 100% Real-World operating data which was captured as machine-measured inputs and outputs in real time, so that the underlying datasets have NO human opinion-based "editing" in them.

    100% Fact Based AI systems will, by the nature of the underlying datasets, have to be specialized tools which work very reliably in specific vertical-application niches. So the wave of the future for wise AI strategists will be to put together easy-to-use suites and portfolios of 100% Fact Based AI tools, which will offer context-based help when they observe users doing things by hand that the systems think they know how to do automatically. Under that structure, users can opt in or opt out to system help, and will know every time that they are allowing the system in, or not. For infotech companies, that approach offers an almost infinite array of product possibilities, for an infinite array of user types. Thus we might get to a world in which a lot of competing infotech companies serve differing customer types very well, with lots of competitive overlap to keep everybody honest.

    So, 100% Fact Based AI is a big opportunity set. Git 'er done!






  • DT

    Debbie C Tamayo

    1 hour ago



    Ok, what can the larger models do?
    Debb’s hubby






  • PD

    Peter DeWeese

    1 hour ago



    This has been in play for a long time. Apple seems behind on LLMs, but was ahead in the niche game when it released AI dev tools years ago and at the same time integrated processor features, real-time green screening using trained models, etc. I think the mundane niche features are far more ready today and do more to improve software features.






  • DP

    DAVID P

    1 hour ago



    AI startups like Mistral, Anthropic, Cohere and AI21 Labs are called startups for a reason: they are experimental, and some will succeed and some will not. Nvidia stock has soared into the stratosphere with all the AI hype heaped upon it. It will take years for AI to sort itself out, and likely the geniuses in DC will interfere in some way. We shall see if smaller is better, as the title of this article claims.
    (Edited)





  • Martin Kubalanza

    2 hours ago



    Typical in the software industry.






  • GK

    Gene Kor

    2 hours ago



    GPT-5 is supposedly done with training and going through its safety review. We'll see if this is all hype and LLMs hit the brick wall.






  • JA

    John ALEXANDER

    2 hours ago



    This is true for LLMs, which are just predictive engines trying to guess the most correct answer. But for AGI -- actual general intelligence that will be able to learn on its own -- this will continue to be the domain of massive computing complexes. And, we are nowhere near AGI today, and some AI researchers feel that we will not get there with LLMs. A new approach is needed.

    LLMs may be stuck as "assistants" for some time until supplanted by AI built from a different paradigm. It is intriguing and eye-opening what LLMs can do, but there seems to be a ceiling.






  • SS

    Stephpen Siu

    2 hours ago



    Society works best when there is a diversity of thoughts and traits - people who like to focus deeply vs. those who think in big pictures.
    It makes sense for the AI ecosystem too.






  • GH

    Gerald Hanweck

    3 hours ago



    When AI gets down to the size of a crow, call me. Crows are highly talented navigators and problem solvers. I'd wager on the crow over an existing self-driving car. And don't get me started on honeybees. AI should fit on a small chip and use very little energy. "History shows again and again how nature points out the folly of man." (BÖC)






    • CM

      Connie Matia

      23 minutes ago



      Before there was the iPhone, there was the Macintosh.








  • RC

    Rick C

    3 hours ago



    The internet is full of huge quantities of garbage: rumor, opinion and hearsay. How is training on garbage supposed to produce quality interpolation/extrapolation (which is AI at its core)? We all know the adage, “I read it on the internet so it must be true”.

    Has anyone run a quality check on AI? Here’s a simple one. Somebody sits down and composes an email the old-fashioned way, by typing it out after thinking about the subject. Then lock that away so AI can’t find it. Ask AI to compose an email on the same subject. Then compare. Do the same for a meeting summary and see how well AI does there.

    Been in the Appleverse for quite a while. Will be the first to acknowledge Siri needs a lot of work. Not enough room here to list the deficiencies.






    • CE

      Carlton Ellis

      1 hour ago



      Without KNOWING what data sets a language model is training on, your comment is more speculative than factual.

      The gist of the article is that companies are building smaller language models with more specialized data.








  • EK

    Elena K

    3 hours ago



    Nice visual






  • Shaker Cherukuri

    3 hours ago



    This is very similar to what automotive system design engineers have been doing since the '90s, albeit on well-defined sensor data, which is structured as opposed to unstructured.






    • EB

      EUGENE BEM

      2 hours ago



      Comparable to the technologies used by the defense sector to facilitate precision weaponry and target collection. AI-guided autonomous weapons are already in service.








  • SS

    Sekar Swaminathan

    4 hours ago

    "Researchers attribute this to factors including a scarcity of high-quality, new data for training."

    So much data is available in video, image and audio formats, and the entire Library of Congress has tons of rare documents.

    There are books on our shelves with notes in the margins that are still not digitized and uploaded into a model yet.

    There are tons of private data lying around on our private disks and storage. What about the old tape drives and even card decks? They are not digitized yet.

    There are tons of data out there still. Saying that a scarcity of data means AGI has matured is like Bill Gates observing that 640K of RAM ought to be enough for everybody.

    I am waiting for my AI-generated doppelganger to attend inane meetings, take notes and make more intelligent observations than I could alone.

    Storytellers can set the stage and the characters and script their story, and AGI generates a complete movie.

    A household AI robot could help with menial tasks and take care of my old parents when I am out and about.

    There are many, many creative applications -- some useful, some curious, and many useless -- that could be built from AGI. The future is wide open.




    • JM

      Jim McCreary

      1 hour ago



      "High quality" is the limiting factor here. As soon as you move beyond 100% operational-reality data collected by machine sensors in real time, you wind up afloat in a sea of opinion, contradiction, and ambiguity. That's why these LLMs often produce nonsense--we're asking them to summarize and make choices from a seething vat of contradictory and ambiguous opinion-based STUFF which can be called "data", but is larded with wrongness, foolishness, and flat-out lies.

      So the wave of the future for AI is likely to be 100% Fact Based systems, which by the nature of their underlying data, will need to be specialized tools with relatively narrow focus. Wise strategists will embrace that to put together large portfolios of 100% Fact Based AI tools which users can either summon up as needed, or set to auto-suggest themselves when users are doing things that an AI module thinks it can do automatically.

      Of course, for all that to happen, the AI system developers are going to need access to lots of 100% Fact Based data, which means they're going to have to BUY data use rights from the operating companies which generate such things. So when we see Big Infotech finally ponying up to buy exclusive access to 100% Fact Based operating data, we'll know that they're on the right track. That might prove expensive, so BIT's current obsession with big models using "free" garbage-in from the Internet will have to find its dead end first.








  • DM

    Dan M

    4 hours ago



    Smaller models need less computing power to train, and way less to use. Time to sell NVDA stocks?






    • Gary J. Rhine

      3 hours ago



      Better you than me. Their backlog is still growing, and newer, smaller, faster chips are still in the announced phase.






    • AR

      anthony r

      4 hours ago



      If you take a smaller on-site model and fine-tune it with locally relevant data, it will still require lots of matrix arithmetic. With so many developers trained on CUDA, I would say choosing NVIDIA GPUs to accomplish that is still a likely scenario. Maybe when the big boys finally build an LLM that achieves AGI and there is nothing left to do, then it might be time to sell.






      • DR

        Dev Roy

        3 hours ago



        A clarifying point here: very few developers are trained on CUDA. The Python libraries (usually) invoke CUDA seamlessly, PyTorch for example. Tomorrow, if the libraries invoke any other accelerator, we would be indifferent to it. So CUDA, while efficient, is abstracted away from most developers of AI models. This whole CUDA argument is Wall Street / consultant mumbo jumbo. Here is a snippet of how CUDA is invoked:

        import torch

        # Use the first GPU if CUDA is available, otherwise fall back to the CPU.
        if torch.cuda.is_available():
            dev = "cuda:0"
        else:
            dev = "cpu"
        device = torch.device(dev)






        • DM

          Dan M

          2 hours ago



          I develop and concur.











