Comment on NVIDIA GPU Versus AMD GPU
From a poster on Seeking Alpha, grxbstrd, a long-time and very knowledgeable NVIDIA investor:
"They convinced me to invest in AMD to diversify my NVIDIA holdings."
Okay, sorry to hear that. Thanks for the honest reply. Here are some thoughts on your initial post:
To be clear, Aldebaran was built to provide the heavy-lifting grunt in the world's (or perhaps the U.S.'s) first "exascale" class supercomputer -- a billion billion (10^18) operations per second. Supercomputers have traditionally been ranked by their 64-bit (double precision) floating point performance.
NVIDIA deprioritized 64-bit in the A100; their view was that the computational requirement for AI is lower precision -- 8-bit, or even 4-bit, in some cases. So they made sure 32-bit and lower precision were their focus, and it performs well, especially with their tensor core technology.
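To make the precision point concrete, here's a minimal NumPy sketch (nothing NVIDIA-specific, and the matrix sizes are just illustrative assumptions): it computes the same matrix product in FP64, FP32, and FP16 and shows how small the error is relative to the FP64 reference, which is roughly why AI workloads can live with reduced precision.

```python
# Toy comparison, not a benchmark: the same matrix product at three precisions.
import numpy as np

rng = np.random.default_rng(0)
a64 = rng.standard_normal((256, 256))   # FP64 inputs (reference)
b64 = rng.standard_normal((256, 256))

ref = a64 @ b64                          # FP64 "ground truth"

for dtype in (np.float32, np.float16):
    # Cast inputs down, multiply, then compare against the FP64 result.
    approx = (a64.astype(dtype) @ b64.astype(dtype)).astype(np.float64)
    rel_err = np.abs(approx - ref).max() / np.abs(ref).max()
    print(f"{np.dtype(dtype).name}: max relative error ~ {rel_err:.1e}")
```

For a neural network doing inference, errors of that order are noise; for a physics simulation ranked on FP64, they are not -- which is the whole fork in the road between these two chips.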
AMD, on the other hand, was given a requirement for double precision: they had to have peak performance in the highest-precision format in order to win the crown. What was the cost? They paid a huge die-area penalty to get there. *And* AMD uses two chips in one package to get the job done -- in reality, all these comparisons pit two chips against a single A100 (AMD's message is device to device). Finally, to my knowledge this part does not contain tensor core technology, a matrix multiply-and-add function that is very beneficial to AI. So let's just conclude AMD targeted this part to steer clear of NVIDIA's domain, AI.
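For readers unfamiliar with the term: a "tensor core" is hardware that performs a fused matrix multiply-accumulate, D = A x B + C, on small tiles in a single operation, typically multiplying low-precision inputs and accumulating in higher precision. Below is a rough NumPy sketch of the operation being accelerated; the 16x16 tile size and the FP16-in/FP32-accumulate choice are assumptions for illustration, not a description of any specific chip.

```python
import numpy as np

def mma_tile(a_fp16, b_fp16, c_fp32):
    """One tensor-core style step: multiply low-precision tiles and
    accumulate into a higher-precision result (D = A @ B + C)."""
    # Inputs arrive in FP16, but the multiply-accumulate runs in FP32,
    # mirroring the mixed-precision scheme such hardware typically uses.
    return a_fp16.astype(np.float32) @ b_fp16.astype(np.float32) + c_fp32

# Illustrative 16x16 tiles; the tile size is just an assumption for the sketch.
rng = np.random.default_rng(1)
a = rng.standard_normal((16, 16)).astype(np.float16)
b = rng.standard_normal((16, 16)).astype(np.float16)
c = np.zeros((16, 16), dtype=np.float32)

d = mma_tile(a, b, c)
print(d.shape, d.dtype)   # (16, 16) float32
```

The point is that doing this whole tile in one hardware instruction, rather than as a long loop of scalar multiplies and adds, is where the big AI throughput numbers come from.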
But beyond that nitty-gritty, I want to bring the conversation back around to your comment about "strong competition."
Is a school bus a "strong competitor" for a drag racer? It depends on what you're measuring. One carries one passenger really fast (and dangerously); the other carries a lot of passengers slowly and safely. My point is that these two products really serve different markets. Aldebaran is looking to compete in a market the A100 never targeted.
Is Aldebaran faster than the A100 in FP64? Of course it is; 64-bit performance was what they set out to slay in a decades-long dream of exascale. AMD gets bragging rights.
I think the meaningful question is: Do those bragging rights translate to meaningful business opportunity?
I'm fairly convinced AMD will struggle to gain traction with this GPU outside their government contracts. In fact, Lisa said she doesn't expect any Data Center GPU traction until '23 or later (perhaps infinity?). Beyond these DoD/DOE systems, AMD has a tiny amount of software support for HPC, and that's detrimental to their effort. I'd predict that's what articles will spotlight in a negative way when more thorough reviews and discussions take place.
The last point I want to address is the AMD fans. They can be convincing, for sure. But AMD has a limited future. Why? Because AMD's business is taking market share from others, not creating new markets out of thin air the way NVDA does.
If you want to hop back and forth between these two investment locomotives, be my guest. But at some point Intel will get their act together. NVDA, on the other hand, is AAA investment quality: leading technology in this new universe of dense computational opportunity, and they have developed the tools to harness it. Their ideas are being applied to many, many different greenfields in technology (AI, robotics, self-driving, Omniverse, etc.). A few years ago those markets basically didn't exist; GPUs and NVIDIA's foresight and persistence exploded them into existence.

Interestingly, AMD led NVIDIA in GPUs years ago; now they are basically marginalized in the sector. That is because of deep thinking and strategic investing by NVIDIA that AMD couldn't match. The potential is many, many times the size of the PC silicon market. AMD has demonstrated no leadership ability to understand or target similar opportunities. And I'd invite any of the AMD fans to come over and debate it.
Cheers, Frank