Beyond the First Wave: The New Architecture of AI Infrastructure
While the initial surge of the AI boom was defined by raw processing power, the industry is shifting toward a more complex infrastructure buildout. The focus is moving from how fast a single chip can run to how efficiently massive clusters of chips can work together.
This transition is creating a “supercycle” where the winners are no longer just the primary chip designers, but the companies that enable connectivity, customization and specialized inference.
Broadcom and the Networking Backbone
Networking is becoming the unsung hero of AI infrastructure. As large language model (LLM) training and inference demand more speed, the ability to transfer data quickly between servers is paramount.
Broadcom has positioned itself as a leader in data center networking through its Tomahawk line of Ethernet switch chips. The technology is a gold standard for high-speed switching and forms the backbone of many hyperscalers’ data centers.
The Pivot to Custom AI Accelerators
Beyond networking, the trend toward application-specific integrated circuit (ASIC) technology is accelerating. Companies are increasingly seeking custom silicon tailored to their specific workloads rather than relying on general-purpose chips.
Broadcom is at the forefront of this movement, providing the intellectual property and expertise to turn customer designs into scalable physical chips. Key examples include:
- Alphabet: Broadcom helped develop the Tensor Processing Units (TPUs) now used by Alphabet and by outside customers such as Anthropic.
- Meta Platforms and OpenAI: Both companies have deals with Broadcom to design their own custom chips.
The scale of this opportunity is massive, with Broadcom projecting $100 billion in custom chip sales alone by fiscal 2027.
AMD: Dominating Inference and Agentic AI
The next phase of AI evolution is centered on inference and agentic AI. While training builds the model, inference is where the model is actually put to work to generate results.
AMD is tackling this with its upcoming MI450 graphics processing unit (GPU). By using a chiplet design, AMD can integrate significantly more memory, reducing costs and boosting inference performance.
The Critical Role of the CPU in Agentic AI
Agentic AI—AI that can act as an autonomous agent—requires a different hardware balance than traditional LLMs. These systems need a much higher CPU-to-GPU ratio because CPUs handle the sequential logic and real-world interactions necessary to manage AI agents.

As a leader in data center central processing units (CPUs), AMD is developing high-performance CPUs specifically designed to meet these agentic demands.
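To make that workload difference concrete, here is a toy sketch of an agent loop (all function names are hypothetical, not any vendor's actual API): most of each step is sequential, branching control logic suited to a CPU, with only the model call itself running on a GPU.

```python
# Toy sketch of an agentic AI loop. All names here are hypothetical;
# this illustrates the workload shape, not any real framework.

def run_agent(task, max_steps=5):
    """Each iteration is mostly sequential CPU work (parsing, branching,
    tool calls) wrapped around a single GPU-bound model call."""
    history = [task]
    for step in range(max_steps):
        # GPU-bound: one inference call per step.
        action = model_infer(history)
        # CPU-bound: interpret the model's output and act on it.
        if action.startswith("finish:"):
            return action.removeprefix("finish:")
        history.append(use_tool(action))  # e.g. search, file I/O, API call
    return history[-1]

# Stand-ins so the sketch runs: a "model" that asks for one tool call
# and then finishes, and a "tool" that echoes its input.
def model_infer(history):
    return "tool:lookup" if len(history) < 2 else "finish:done"

def use_tool(action):
    return f"result of {action}"

print(run_agent("summarize quarterly report"))  # -> done
```

Because the GPU sits idle while the surrounding logic runs, scaling up the number of concurrent agents raises CPU demand faster than GPU demand, which is the ratio shift described above.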
Building End-to-End Systems
AMD is moving from being a component supplier to a full-system provider. Through the acquisition of ZT Systems, the company can now provide end-to-end full rack systems optimized for inference and agentic AI.
This strategic shift is supported by massive industry commitment. OpenAI has committed to 6 gigawatts’ worth of AMD’s new GPUs and has taken a stake in the company through warrants, and both it and Meta are integrating the ROCm software platform into their workflows.
Comparison of AI Infrastructure Leaders
| Feature | Broadcom (AVGO) | AMD (AMD) |
|---|---|---|
| Core AI Strength | Networking & Custom ASICs | Inference & Agentic AI CPUs/GPUs |
| Key Technology | Tomahawk Ethernet | MI450 GPU & ROCm Software |
| Major Partners | Alphabet, Meta, OpenAI | Meta, OpenAI |
Frequently Asked Questions
What is the difference between AI training and inference?
Training is the process of creating an AI model using massive datasets. Inference is the process of the trained model applying that knowledge to answer a prompt or perform a task.
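As a toy illustration (plain Python with made-up numbers, nothing like the scale of a real AI system): training repeatedly adjusts a model's parameters against data with known answers, while inference is a single, cheap application of the finished parameters to a new input.

```python
# Toy illustration of training vs. inference with a one-parameter model
# (y = w * x). The data and numbers are illustrative only.

data = [(1, 2), (2, 4), (3, 6)]  # inputs x with known answers y = 2x

# Training: loop over the dataset many times, nudging the weight w
# to shrink the prediction error. This is the expensive phase.
w = 0.0
for epoch in range(100):
    for x, y in data:
        error = w * x - y
        w -= 0.01 * error * x  # gradient descent step

# Inference: one multiplication with the frozen weight. Cheap and fast.
print(round(w * 10))  # the trained model's answer for a new input x=10
```

Real models have billions of weights instead of one, but the asymmetry is the same, which is why training and inference stress hardware so differently.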
Why is Broadcom important for AI if it doesn’t make GPUs?
Broadcom provides the networking (like Tomahawk Ethernet) that allows thousands of GPUs to work together as one cluster and helps companies like Alphabet and Meta design their own custom AI chips (ASICs).
What is “Agentic AI” and why does it need CPUs?
Agentic AI refers to systems that can perform complex, multi-step tasks autonomously. These require sequential logic and interaction management, which are primary strengths of CPUs rather than GPUs.
Want to stay ahead of the AI supercycle?
Share your thoughts in the comments below: Do you believe custom silicon will eventually replace general-purpose AI chips? Subscribe to our newsletter for more deep dives into the future of technology infrastructure.
