AWS re:Invent – Custom Silicon Continues to Mature

November 28, 2023 / Ben Bajarin

At Amazon’s annual event highlighting the ways AWS continues to innovate around cloud platforms and infrastructure, the company unveiled updates to two key parts of its custom silicon stack: Graviton 4 and Trainium 2.

Key Takeaways

  • In its fourth generation, Graviton continues to offer increased value in performance and price efficiency
  • Graviton 4 moves to 128 cores, which appears to be the baseline for Arm-based cloud native CPU designs to stay competitive against rival roadmaps
  • Trainium 2 continues to gain traction, with new foundational customers standardizing on training their models on the new custom ASIC


What’s Significant

From a strategy perspective, Amazon’s custom silicon approach is the most mature of the top three cloud providers. AWS CEO Adam Selipsky went out of his way to mention the maturity of this roadmap compared to competitors (Azure) who haven’t shipped their V1 custom silicon yet. The most mature product is Graviton, which, now in its fourth generation, is up to 128 cores and still maximizing performance per watt, giving customers the most significant energy cost savings of any AWS instance.

At a customer panel we attended, two different customers using Graviton instances stated that the cost and energy savings were the biggest benefit they saw as they moved from x86 to Arm-based Graviton instances. This is consistent with other customer testimonials we have heard. Both customers cited the lengthy transition process, but having made the jump, they don’t see themselves going back because of the cost savings.

While Amazon can see Graviton moving to HPC applications, we still believe it is best suited for cloud native workloads. We will watch how the Graviton roadmap evolves, as both Intel and AMD are aggressively pursuing cloud native SKUs in their own roadmaps to shrink the performance-per-watt gap against Arm-based cloud native silicon.

Trainium 2, on paper, sounds like a significant upgrade in terms of training capabilities. News of the chip was followed by Anthropic’s CEO saying they will be using Trainium 2 to train future models of Claude, which will be up in the trillions of parameters. Assuming the time to train and bring to market is competitive with Nvidia, this will be a solid customer proof point for AWS with Trainium. We still have questions as to how competitive Trainium is vs. Nvidia GPUs, and at the end of the day, giving the customer the option to choose is Amazon’s main value proposition. Having Trainium 2 as an option on both price competitiveness and availability, since there are wait times to use Nvidia clusters, helps make Trainium’s positioning clearer. Again, the question remains about the Trainium roadmap vs. merchant silicon roadmaps, and what more Amazon can do to integrate and optimize its AWS software stack more tightly with its custom silicon roadmap to create more sustainable differentiators over merchant silicon efforts. That said, we see more organizations and enterprises that just want to train and fine-tune their own domain-specific models. These customers don’t need the most powerful GPUs, and in these cases, which are numerous, Trainium should more than suffice.

Announcement Details Summary

  • AWS unveiled next generation Graviton4 and Trainium2 chips delivering major price-performance and energy efficiency improvements
  • Graviton4 provides up to 30% better compute performance, 50% more cores, and 75% more memory bandwidth than Graviton3
  • Trainium2 delivers up to 4x faster training performance than first generation Trainium and improves energy efficiency up to 2x
  • Trainium2 will enable training foundation models with hundreds of billions to trillions of parameters significantly faster and cheaper
  • New EC2 instances powered by Graviton4 (R8g) and Trainium2 (Trn2) were announced to take advantage of the new chips
  • Major AWS customers and partners including Anthropic, Databricks, Datadog, Epic, Honeycomb, and SAP plan to use the new AWS-designed chips
