From: BeenRetired 10/23/2025 7:27:40 AM
1 Recommendation (Recommended By: Tobias Ekman)

“GSI Gemini-II (PIM) APU will deliver ten times higher throughput”


TechRadar

The associative processing unit wants to displace Nvidia's GPU as the go-to AI powerhouse by putting compute in the memory itself

Story by Efosa Udinmwen


  • GSI Gemini-I APU reduces constant data shuffling between the processor and memory systems
  • Completes retrieval tasks up to 80% faster than comparable CPUs
  • GSI Gemini-II APU will deliver ten times higher throughput

GSI Technology is promoting a new approach to artificial intelligence processing that places computation directly within memory.

A new study by Cornell University draws attention to this design, known as the associative processing unit (APU).

It aims to overcome long-standing performance and efficiency limits, suggesting it could challenge the dominance of the best GPUs currently used in AI tools and data centers.

A new contender in AI hardware
Published in an ACM journal and presented at the recent MICRO ’25 conference, the Cornell research evaluated GSI’s Gemini-I APU against leading CPUs and GPUs, including Nvidia’s A6000, using retrieval-augmented generation (RAG) workloads.

The tests spanned datasets from 10 to 200GB, representing realistic AI inference conditions.
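
For a sense of what such a workload looks like: the heart of a RAG benchmark is a top-k similarity search over a large table of embedding vectors, a memory-bound scan that rewards architectures which avoid hauling the whole table to the compute units. A minimal Python sketch of that retrieval step (sizes and names here are illustrative, not taken from the Cornell study):

    import numpy as np

    # Illustrative corpus: 100,000 document embeddings of dimension 768.
    # The Cornell tests used datasets of 10 to 200GB; this is a toy stand-in.
    rng = np.random.default_rng(0)
    corpus = rng.standard_normal((100_000, 768)).astype(np.float32)
    corpus /= np.linalg.norm(corpus, axis=1, keepdims=True)

    def retrieve_top_k(query, corpus, k=5):
        # Brute-force cosine-similarity retrieval: one full pass over the
        # corpus per query, which is why the memory bus dominates the cost.
        query = query / np.linalg.norm(query)
        scores = corpus @ query
        top = np.argpartition(scores, -k)[-k:]
        return top[np.argsort(scores[top])[::-1]]

    query = rng.standard_normal(768).astype(np.float32)
    print(retrieve_top_k(query, corpus))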

By performing computation within static RAM, the APU reduces the constant data shuffling between the processor and memory.

This is a key source of energy loss and latency in conventional GPU architectures.
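
One way to picture the difference, as a conceptual sketch only (this simulates the idea on an ordinary CPU and is not GSI's actual programming model): a conventional processor streams every word across the memory bus to compare it, while an associative memory broadcasts the pattern to all rows at once and only the match results travel back.

    import numpy as np

    rng = np.random.default_rng(1)
    mem = rng.integers(0, 256, size=100_000, dtype=np.uint8)

    # Conventional model: every word is shuttled out of memory so the
    # processor can compare it, one element at a time.
    def processor_side_search(pattern):
        return [i for i, word in enumerate(mem) if word == pattern]

    # Associative model: every memory row compares itself against the
    # broadcast pattern in parallel; only the match flags move.
    def in_memory_search(pattern):
        return np.flatnonzero(mem == pattern)

    assert processor_side_search(42) == list(in_memory_search(42))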

The results showed the APU could achieve GPU-class throughput while consuming far less power.

GSI reported its APU used up to 98% less energy than a standard GPU and completed retrieval tasks up to 80% faster than comparable CPUs.

Such efficiency could make it appealing for edge devices such as drones, IoT systems, and robotics, as well as for defense and aerospace use, where energy and cooling limits are strict.

Despite these findings, it remains unclear whether compute-in-memory technology can scale to the same level of maturity and support enjoyed by the best GPU platforms.

GPUs currently benefit from well-developed software ecosystems that allow seamless integration with major AI tools.

For compute-in-memory devices, optimization and programming remain emerging areas that could slow broader adoption, especially in large data center operations.

GSI Technology says it is continuing to refine its hardware, with the Gemini-II generation expected to deliver ten times higher throughput and lower latency.

Another design, named Plato, is in development to further extend compute performance for embedded edge systems.

“Cornell’s independent validation confirms what we’ve long believed: compute-in-memory has the potential to disrupt the $100 billion AI inference market,” said Lee-Lean Shu, Chairman and Chief Executive Officer of GSI Technology.

“The APU delivers GPU-class performance at a fraction of the energy cost, thanks to its highly efficient memory-centric architecture. Our recently released second-generation APU silicon, Gemini-II, can deliver roughly 10x faster throughput and even lower latency for memory-intensive AI workloads.”

Via TechPowerUp


PS
After much Copilot querying, I'm convinced the Logic n Memory n Storage Shrink n Stack bit bonanza JUST started.
JMNSHO.

ASML
Village