Technology Stocks : AMD, ARMH, INTC, NVDA


To: neolib who wrote (71709) 10/20/2025 11:19:57 PM
From: Joe NYC
 
I think most people picture one model running across 100s of GPUs.

I think the scenario describes multiple models (10s of them) running at the same time. Instead of statically allocating a fixed number of GPUs to each model (and leaving them very underutilized), they put the GPUs into pools, and the GPUs get used based on demand.

Nothing more to it, IMO, other than moving an unused GPU to where it is needed, or not having the unused GPU at all.
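
To put rough numbers on the idea, here is a minimal Python sketch. Everything in it is an assumption made up for illustration: the model names, the per-interval demand figures, and the two helper functions. It is not how any real scheduler works. It only compares how many GPUs you would need if each model were sized for its own peak against how many a shared pool would need to cover the combined peak.

```python
# Hypothetical GPU demand per model over four time intervals.
# The model names and numbers are invented for illustration only.
demand = {
    "model_a": [8, 2, 1, 1],
    "model_b": [1, 8, 2, 1],
    "model_c": [1, 1, 8, 2],
}


def gpus_needed_static(demand_by_model):
    """Static allocation: each model reserves enough GPUs for its own peak."""
    return sum(max(series) for series in demand_by_model.values())


def gpus_needed_pooled(demand_by_model):
    """Pooled allocation: the pool only has to cover the peak of total demand."""
    per_interval_totals = (sum(step) for step in zip(*demand_by_model.values()))
    return max(per_interval_totals)


def utilization(demand_by_model, gpu_count):
    """Fraction of available GPU-intervals that actually carry load."""
    intervals = len(next(iter(demand_by_model.values())))
    used = sum(sum(series) for series in demand_by_model.values())
    return used / (gpu_count * intervals)


static_gpus = gpus_needed_static(demand)
pooled_gpus = gpus_needed_pooled(demand)

print("static:", static_gpus, "GPUs, utilization", round(utilization(demand, static_gpus), 2))
print("pooled:", pooled_gpus, "GPUs, utilization", round(utilization(demand, pooled_gpus), 2))
```

With these made-up numbers, static allocation needs 24 GPUs and the pool needs 11, because the three models do not peak at the same time. If all models peaked together, the two counts would be equal and pooling would buy nothing.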