To: QCOM_HYPE_TRAIN who wrote (197240), 1/17/2026 4:22:02 AM
From: Wildbiftek

Viewed from a higher level, I'd say Qualcomm's push is all in on energy efficiency and inferencing.

Unpacking this, their neural compute strategy:

1) Doesn't chase training, where nVidia has a large lead and a moat in its CUDA stack -- and instead goes after the >15x higher volume of inference hardware.
2) Starts from power efficiency -- the strategic push begins at the edge and encroaches into the data center.

On the efficiency point: the high-precision data types and the Von Neumann-style architecture of GPUs, while useful for training, still shuttle data needlessly to and from memory during inference. That consumes vastly more energy than a purpose-built accelerator whose design aligns better with the data flow of pre-trained network architectures.
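
As a rough illustration of why data movement, not arithmetic, dominates inference energy, here is a toy back-of-envelope model in Python. The per-operation energy figures are my assumptions, loosely based on commonly cited ~45 nm estimates (Horowitz, ISSCC 2014), the layer size and the single-token / no-reuse scenario are hypothetical, and activations are ignored -- so treat the output as an order-of-magnitude sketch, not a benchmark:

# Toy energy model for one matrix-vector layer during inference.
# All per-operation energies below are assumed round numbers for
# illustration only (roughly Horowitz ISSCC 2014, ~45 nm class).

PJ = 1e-12  # picojoules

ENERGY = {
    "fp32_mac": 4.6 * PJ,          # ~32-bit multiply + add
    "int8_mac": 0.2 * PJ,          # ~8-bit multiply + add
    "sram_read_32b": 5.0 * PJ,     # on-chip buffer read, 32-bit word
    "dram_read_32b": 640.0 * PJ,   # off-chip DRAM read, 32-bit word
}

def layer_energy(rows, cols, bytes_per_weight, weights_on_chip):
    """Energy for one (rows x cols) matrix-vector product, weights only."""
    macs = rows * cols
    words_moved = macs * bytes_per_weight / 4  # 32-bit words of weight traffic
    mac_e = macs * (ENERGY["int8_mac"] if bytes_per_weight == 1
                    else ENERGY["fp32_mac"])
    mem_e = words_moved * (ENERGY["sram_read_32b"] if weights_on_chip
                           else ENERGY["dram_read_32b"])
    return mac_e + mem_e

# GPU-like worst case: fp32 weights streamed from DRAM for every token
gpu_like = layer_energy(4096, 4096, 4, weights_on_chip=False)
# NPU-like case: int8 weights held in on-chip SRAM
npu_like = layer_energy(4096, 4096, 1, weights_on_chip=True)

print(f"GPU-like: {gpu_like * 1e6:.1f} uJ, NPU-like: {npu_like * 1e6:.1f} uJ, "
      f"ratio ~{gpu_like / npu_like:.0f}x")

The absolute numbers are toys; the point is that off-chip data movement swamps the arithmetic, and that is exactly the gap a purpose-built inference accelerator attacks by keeping quantized weights resident on chip.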

nVidia was specifically targeting DLSS as a visual application with its tensor units starting from the Turing series, and for that class of visual applications, which require interactivity at low latency, it makes a lot of sense to do inferencing within the GPU. For more general-purpose inferencing in non-visual applications such as LLMs, the more traditional data-flow architecture and mix of precisions in a GPU make it a lot less efficient.
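
To make the precision point concrete, here is a minimal sketch of post-training int8 quantization of a single weight tensor. It is purely illustrative -- real toolchains do far more careful calibration -- but it shows the basic trade: 4x fewer bytes to move per weight in exchange for a small rounding error:

import numpy as np

def quantize_int8(w):
    # Symmetric quantization: map the fp32 range onto [-127, 127]
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

w = np.random.randn(256, 256).astype(np.float32)  # stand-in weight tensor
q, s = quantize_int8(w)
err = np.abs(w - dequantize(q, s)).mean()
print(f"4x fewer bytes per weight, mean abs reconstruction error {err:.4f}")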

I'm guessing that, given the >3-year cycle times for major GPU design revisions, Eric had already completed the designs for the next major revision of Adreno, with soon-to-be industry-standard tensor units in the mix in preparation for the next-gen graphics APIs. He was probably hoping the GPUs would lead the charge for Qualcomm's AI push in data centers, potentially for training in addition to inference, but the Q pivoted to the more focused NPUs out of a desire for company-wide focus on 1) and 2) above. Still, it's sad to lose a legend in the GPU space, and I do hope Qualcomm either has a strong internal candidate to fill Eric's role or poaches senior talent from a start-up or an established GPU house...

Intel, which sees data centers as its stronghold after losing the MacBook SoC socket, is trying to mount a challenge to nVidia by pushing back out from the data center to reclaim the client space with GPUs that can be used for both training and inference. Eric has 14 years of experience shipping billions of energy-efficient GPUs, so he's a natural candidate to design that part of their comeback. That said, if GPUs alone are the primary thing in the mix, the scale of energy required for inference will be beyond what our existing and future energy infrastructure can bear, and at the edge NPUs can simply do more per watt than GPUs.

Once the novelty of the current generation of science-experiment-style LLMs wears off, people will care about things like privacy, the proper scope of applications for correctness, and cost. That means client-side processing, along with smaller, more focused models that do things like prediction in addition to generation, will matter more -- and so the energy metrics and edge-first approach of Qualcomm's NPUs will matter too.