Understanding Hyperscaler Custom ASIC Strategy

February 11, 2025 / Ben Bajarin

The GPU vs. custom ASIC debate is one of the most discussed in our circles of tech executives and investors. It is also a debate that requires more in-depth analysis to fully understand. I want to look at the custom ASIC strategy in two ways. The first is how custom ASICs are primarily used today for first-party workloads; by first-party workloads, I mean both AI and non-AI workloads. Second, and this is a key part of the debate, I want to look at the opportunity for custom ASICs to move beyond first-party workloads and be adopted by third-party enterprises and software developers.

Custom ASICs and First-Party Workloads
The showcase example in custom ASICs for first-party workloads is Google. Google is now six generations into their cloud TPU architecture and has the largest custom ASIC installed base of any hyperscaler. The vast majority of Google’s internal software, both non-AI and AI, runs on TPUs, and most of their GPU installed base serves third-party developers on GCP. Using our accelerated compute installed base model, it is interesting to note that Google has the smallest number of GPUs available at GCP compared to AWS and Azure; however, when you look at the total accelerated compute infrastructure, including both GPUs and TPUs, Google has the highest share of accelerated compute silicon at ~35%. The significant volume of TPUs relative to GPUs shows Google’s foresight in designing silicon for their own internal workloads, and as a result Google has the least dependence (some would argue no dependence) on Nvidia for their own software.

AWS has also been in the custom silicon game for a while, longest with Graviton, their Arm-based CPU. Graviton is still used primarily for internal workloads (Amazon.com), but with multiple generations of Trainium and Inferentia, Amazon now has the custom accelerated infrastructure they need, primarily for their own internal non-AI and AI workloads. Again, looking at our accelerated compute installed base model, Amazon is split roughly 50-50 between GPU capacity and custom ASIC capacity, and sits just below Google with total accelerated compute capacity at ~30%.
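To make the share math concrete, here is a toy sketch of how an accelerated compute installed base share could be computed from fleet counts. All numbers below are invented placeholders chosen only to roughly mirror the ~35% and ~30% shares cited; they are not figures from the actual model.

```python
# Toy sketch of an "accelerated compute installed base" share calculation.
# Every fleet number here is a hypothetical placeholder, NOT real data.

fleets = {
    # provider: {"gpus": units, "custom_asics": units}
    "Google": {"gpus": 400_000, "custom_asics": 1_400_000},
    "AWS":    {"gpus": 750_000, "custom_asics": 750_000},
    "Azure":  {"gpus": 1_300_000, "custom_asics": 100_000},
    "Other":  {"gpus": 500_000, "custom_asics": 0},
}

# Total accelerated compute counts GPUs and custom ASICs together.
total = sum(f["gpus"] + f["custom_asics"] for f in fleets.values())

for name, f in fleets.items():
    units = f["gpus"] + f["custom_asics"]
    print(f"{name}: {units / total:.0%} of accelerated compute units")
```

The point of the sketch is that a provider can hold a small GPU share yet lead in total accelerated compute once its custom ASIC fleet is counted.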

Google and Amazon have been very vocal about their custom silicon strategy: the primary benefits they cite are better performance-per-watt, TCO efficiencies, and more. Below is a visual of the primary benefits consistently highlighted with custom ASICs.
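To illustrate why performance-per-watt feeds directly into TCO, here is a back-of-envelope sketch. Every number in it (power price, chip wattage, fleet size, the 1.5x efficiency gain) is a hypothetical assumption for illustration, not a measured figure.

```python
# Back-of-envelope sketch: performance-per-watt -> energy TCO.
# All inputs are hypothetical assumptions for illustration only.

POWER_PRICE_KWH = 0.08   # assumed all-in datacenter rate, $/kWh
HOURS_PER_YEAR = 8760

def annual_energy_cost(watts_per_chip, n_chips):
    """Annual electricity cost for a fleet at the assumed rate."""
    kwh = watts_per_chip * n_chips * HOURS_PER_YEAR / 1000
    return kwh * POWER_PRICE_KWH

# Same throughput target, same fleet size; the hypothetical custom
# ASIC does the same work at 1.5x better performance-per-watt, so
# each chip draws proportionally less power for that workload.
gpu_cost  = annual_energy_cost(watts_per_chip=700, n_chips=10_000)
asic_cost = annual_energy_cost(watts_per_chip=700 / 1.5, n_chips=10_000)

print(f"GPU fleet:  ${gpu_cost:,.0f}/yr")
print(f"ASIC fleet: ${asic_cost:,.0f}/yr")
print(f"Energy savings: {1 - asic_cost / gpu_cost:.0%}")
```

Even this simplified model (it ignores capex, cooling, utilization, and amortization) shows how an efficiency multiplier compounds into meaningful annual savings at fleet scale.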

It is foundational to understand that, for now, the strategic goal of these custom ASICs has only been to optimize internal workloads across the efficiency metrics charted above. While we can argue that part of the strategy was to reduce reliance on Nvidia, I do not think this was the primary or originating goal. Specific-purpose workloads, like those internal to Google and Amazon, were prime candidates for specific-purpose silicon. To validate this further, it is interesting to note that Microsoft has the largest GPU installed base of all three public cloud hyperscalers. In context, this makes sense given that Microsoft has largely been running AI workloads for OpenAI, which run on Nvidia GPUs. Microsoft has recently developed Cobalt, their custom Arm CPU, and Maia, their custom ASIC, and has begun running internal workloads on that silicon. However, noting that OpenAI workloads are a foundational part of Microsoft’s strategy (for now) makes their ownership of the largest share of GPUs and the smallest share of custom ASICs a key observation.

The Role of GPUs
With the above foundation laid, the question now lies in the role of the GPU. It is safe to assume that, directionally, Microsoft, Google, and Amazon will continue to optimize their internal workloads for their custom ASICs and thus decrease their reliance on Nvidia GPUs for their internal workloads only. The question, then, is whether there is an opportunity for those custom ASICs in third-party workloads as well. Right now, we have little to no definitive evidence to support that theory. The vast majority of third-party AI workloads are still being run on Nvidia GPUs, and there is no sign of that slowing down. There are a few arguments against custom ASICs for third parties.

The first is architectural compatibility. This is a staple argument from Jensen Huang when he addresses the GPU vs. custom ASIC question. Developers want to know their workloads have some form of compatibility across platforms, and Nvidia GPUs provide that architectural compatibility. Much of that has to do with CUDA and its standardization for most AI/HPC software developers. If a company decides to lift and shift its AI workloads to run on something like Google’s TPU, it will have to do all of that software optimization itself, specifically for the TPU, and as a result will lose architectural and workload compatibility across clouds/compute architectures.

Another argument Jensen Huang likes to use is the massive GPU installed base. If you look at the installed base of Nvidia GPUs across both the datacenter and client PCs, it is roughly 3M units, with about half coming from the datacenter/cloud installed base. No other high-performance computing architecture or accelerator comes close to the Nvidia GPU installed base. These two points together make the strongest case for Nvidia GPUs for third-party software developers, as well as for third-party enterprises looking to run their AI workloads in the cloud.

This point is reinforced by a recent AI workload survey from UBS that asked CTO/CIO decision makers where they intend to spend their money on inferencing their AI workloads. Approximately 64% said Nvidia GPUs will be their primary choice for inferencing. That number was even higher for training, obviously.

There is one other technical advantage of GPUs I want to point out. What is distinctly unique about GPU architectures is their programmability. This allows them to accomplish two critical things computationally. First, they can run standardized, general-purpose workloads, making them a standard for workflows. Second, they can also be special-purpose, in this case programmed to do all the things custom ASICs can do, and more. This unique technical differentiator is another reason to believe GPUs, and Nvidia GPUs in particular, will remain the dominant platform for third-party software.
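The programmability contrast can be caricatured in a few lines of code. The classes and operation names below are invented purely for illustration; real GPUs and ASICs are vastly more complex, but the asymmetry is the same: a programmable device accepts any kernel, a fixed-function part only what was designed into the silicon.

```python
# Toy illustration of the programmability argument. All names here
# are invented for illustration, not real hardware APIs.

class ProgrammableAccelerator:
    """GPU-like: executes any user-supplied kernel."""
    def run(self, kernel, data):
        return kernel(data)

class FixedFunctionASIC:
    """ASIC-like: supports only the ops hardwired at design time."""
    def __init__(self, supported_ops):
        self.supported_ops = supported_ops
    def run(self, op_name, data):
        if op_name not in self.supported_ops:
            raise NotImplementedError(f"{op_name} not supported in silicon")
        return self.supported_ops[op_name](data)

gpu = ProgrammableAccelerator()
asic = FixedFunctionASIC({"relu": lambda xs: [max(x, 0) for x in xs]})

data = [-2, -1, 0, 3]
print(gpu.run(lambda xs: [x * x for x in xs], data))  # any new kernel works
print(asic.run("relu", data))                         # only designed-in ops
# asic.run("fft", data) would raise NotImplementedError
```

The design point: when a new workload arrives, the programmable device absorbs it in software, while the fixed-function part needs a new silicon generation.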

My current thesis, which will adapt as new information comes to light, is precisely as I have laid out. Custom ASICs will remain primarily for first-party hyperscaler workloads. There is no question that the custom ASIC market will continue to grow as hyperscalers keep investing and scaling up their custom silicon strategy for their own internal workloads. I am skeptical they will scale those ASICs out to third parties, which means there is an upper limit to the custom ASIC market in terms of dollar upside for companies like Broadcom, Marvell, and TSMC/Intel.

If the third-party software market continues to favor Nvidia GPUs, then the majority of accelerated compute dollars through 2030 will remain with Nvidia.

In terms of what to watch for, I made a visual of a few of the key market dynamics that could swing this pendulum one way or the other.
