Command Palette

Search for a command to run...

Discover

OpenAI and Broadcom Unveil "Jalapeño," a Custom LLM Inference Chip to Reduce Reliance on Nvidia

OpenAI and Broadcom unveiled Jalapeño, OpenAI's first custom AI inference chip, developed in just nine months with the help of OpenAI's own models. The ASIC is designed to cut inference costs by roughly 50% and will be deployed at gigawatt-scale data centers by end of 2026.

OpenAI and Broadcom Unveil "Jalapeño," a Custom LLM Inference Chip to Reduce Reliance on Nvidia
Click to expand

From blank-slate design to silicon in nine months

OpenAI and Broadcom on June 24, 2026, unveiled Jalapeño, OpenAI's first custom-built AI chip — an application-specific integrated circuit designed from the ground up for large language model inferenceopenai +1. The announcement, made eight months after the partnership was publicly disclosed, marks OpenAI's most aggressive push yet to control its own hardware stack and reduce its dependence on Nvidia's GPUscnbc +1. Engineering samples were physically delivered to OpenAI CEO Sam Altman and President Greg Brockman at the event, and early tests show Jalapeño already running production workloads — including GPT-5.3-Codex-Spark — in a lab environmentopenai +1.

OpenAI and Broadcom moved from initial schematics to a completed manufacturing tape-out in just nine months, a pace the companies describe as the fastest ASIC development cycle ever achieved in high-performance advanced semiconductorsopenai +1. The speed was aided by OpenAI's own AI models, which were used to accelerate parts of the chip design and optimization processtechcrunch +1. "The degree to which our models have been able to accelerate it was very surprising to us," Brockman told CNBCcnbc.

Purpose-built for inference, not general compute

Unlike Nvidia's general-purpose GPUs, Jalapeño is an ASIC — narrower in scope but purpose-fit for the specific workloads behind ChatGPT, Codex, and OpenAI's APItechcrunch +1. The architecture minimizes data movement and balances compute, memory, and networking resources so real-world performance runs closer to theoretical peak efficiencyopenai. Early tests indicate performance-per-watt "substantially better than current state-of-the-art," and Bloomberg reported the chip is expected to cut inference costs by roughly 50%venturebeat +1.

Broadcom is contributing silicon implementation and Tomahawk networking technology, while manufacturing partner Celestica handles board, rack, and system integrationopenai +1. The goal is deployment at gigawatt-scale data centers alongside Microsoft and other partners by the end of 2026, with Broadcom CEO Hock Tan projecting a full ramp in the first half of 2028cnbc.

Closing the gap with hyperscaler rivals

The launch narrows a structural gap that has long disadvantaged OpenAI relative to cloud giants like Google and Amazon, which have run proprietary AI silicon — Google's Tensor Processing Units and Amazon's Trainium — to serve AI workloads at lower costventurebeat +1. OpenAI's 2025 financials underscore the urgency: the company posted $13.07 billion in revenue against $34 billion in total operating expenses, with compute costs including more than $10.59 billion paid to Microsoft for cloud infrastructureventurebeat.

Jalapeño does not replace Nvidia in OpenAI's stack. Nvidia finalized a $30 billion investment in OpenAI as part of a $110 billion funding round earlier in 2026, and its hardware will remain central to model trainingventurebeat +1. OpenAI also holds deals with AMD, Cerebras, and AWS for additional computecnbc. What Jalapeño changes is OpenAI's leverage at the inference layer — the point where AI actually reaches users — cutting per-query costs and shoring up a credible path toward profitability ahead of a widely anticipated IPOventurebeat.