OpenAI and Broadcom Partner to Develop Custom AI Inference Chip

OpenAI and Broadcom have announced the development of “Jalapeño,” a custom AI inference processor designed to optimize performance per watt for large language models. The accelerator, built in a nine-month development cycle, is intended to power gigawatt-scale data centers alongside partners including Microsoft, with deployment expected to begin in 2026. According to OpenAI leadership, the chip is architected specifically for LLM inference rather than general-purpose computing.

Why Is OpenAI Building Its Own Silicon?

OpenAI is moving toward a “full-stack” infrastructure strategy to gain control over the hardware running its models. By designing chips, kernels, and networking systems, the company aims to reduce the cost and latency associated with serving AI to users. According to OpenAI President Greg Brockman, this approach is intended to make intelligence “more abundant” and affordable. Unlike general-purpose GPUs, Jalapeño is a “blank-slate” design, built specifically to handle the memory movement and networking patterns required by frontier LLMs.

Pro Tip: Look for companies to increasingly shift toward “vertical integration.” By controlling the entire stack—from the model architecture down to the silicon—firms can optimize for specific workloads, effectively squeezing more efficiency out of every watt of electricity.

How Does Jalapeño Compare to Existing Accelerators?

Early laboratory testing indicates that Jalapeño will offer performance per watt that is “substantially better” than current state-of-the-art accelerators, according to OpenAI. While final benchmarks have not been released, the architecture is designed to minimize data movement, which is often a primary bottleneck in high-performance computing. By balancing compute, memory, and networking resources, the chip aims to achieve realized utilization closer to its theoretical peak than general-purpose alternatives.

Feature	Standard GPU	Jalapeño (Inference Processor)
Design Focus	General-purpose AI/Graphics	Specialized LLM Inference
Target Outcome	Versatility	Performance per Watt
Timeline	Multi-year cycles	9-month development cycle

What Role Did AI Play in the Chip’s Development?

The nine-month development cycle for Jalapeño is the fastest ASIC development cycle in high-performance semiconductors to date, according to OpenAI. The company utilized its own AI models to assist in the design and optimization process. This represents a shift in hardware engineering: using software to design the very infrastructure that will eventually host more advanced software. Broadcom provided the silicon implementation and networking technology, specifically utilizing its Tomahawk networking silicon to support the large-scale production requirements.

OpenAI and Broadcom announce multi-year partnership to develop custom chips

What Happens to Data Centers in 2026?

The collaboration between OpenAI, Broadcom, and system integrator Celestica is focused on the transition to “gigawatt-scale” data centers. These facilities require specialized hardware to manage the massive power and thermal loads associated with modern AI. According to Broadcom CEO Hock Tan, the partnership is part of a “multi-generation roadmap” that will see the deployment of these custom accelerators starting in 2026. This infrastructure is intended to support not just OpenAI’s internal needs, but the requirements of its broader partner ecosystem, including Microsoft.

Did you know? A “gigawatt-scale” data center is an immense facility capable of powering approximately 750,000 homes. Scaling to this level requires custom networking silicon to prevent data bottlenecks between thousands of interconnected AI chips.

Frequently Asked Questions

Will Jalapeño replace NVIDIA GPUs?
Jalapeño is designed as a specialized inference accelerator. OpenAI’s strategy focuses on building its own custom silicon for specific LLM workloads, though it will continue to operate within a broader compute ecosystem.
When will these chips be available?
Engineering samples are currently running in labs. Broadcom and OpenAI have stated that large-scale deployment is expected to begin by the end of 2026.
Why is “performance per watt” so important?
As AI models grow in size, the electricity required to run them becomes a significant operational expense and a physical constraint on data center capacity. Higher performance per watt allows companies to deliver more intelligence without a linear increase in power consumption.

Are you interested in how hardware shifts will change the economics of AI? Subscribe to our newsletter for deep dives into the infrastructure powering the next generation of LLMs.

OpenAI and Broadcom Partner to Develop Custom AI Inference Chip

Why Is OpenAI Building Its Own Silicon?

How Does Jalapeño Compare to Existing Accelerators?

What Role Did AI Play in the Chip’s Development?

What Happens to Data Centers in 2026?

Frequently Asked Questions

Share this:

Related

It Is a Threat

US and Russia Oppose UN HIV/AIDS Declaration

You may also like

Leave a Comment Cancel Reply