OpenAI is moving deeper into the hardware business.
The company has unveiled Jalapeño, a custom AI inference chip developed with Broadcom and designed to run large language models more efficiently after they have already been trained. The move could help OpenAI reduce the cost of serving products such as ChatGPT and Codex, while giving it more control over the infrastructure behind its AI systems.
Here are five things you should know about this new chip.
1. What Jalapeño is designed to do
Jalapeño is not a general-purpose processor. It is an ASIC (application-specific integrated circuit) built specifically for LLM inference, the process of running trained AI models to respond to user prompts in real time.
According to OpenAI, the design was built from scratch around its internal understanding of model behavior, including kernels, memory movement, and serving systems. Unlike training-focused chips such as Nvidia GPUs, Jalapeño is designed to make AI responses faster, cheaper, and more efficient at scale.
2. A nine-month sprint powered by AI
One of the most notable aspects of the project is its development timeline. OpenAI says Jalapeño went from design to manufacturing tape-out in just nine months, which it believes may be one of the fastest advanced chip development cycles on record.
The company also says its own AI models helped accelerate parts of the design process, essentially using AI to build the hardware that will later run AI systems. Broadcom provided critical silicon implementation and networking support, including its Tomahawk networking technology, while Celestica contributed to board and system integration.
3. Why OpenAI is doing this now
The move underscores a broader industry shift: major AI companies want to reduce their dependence on external chip suppliers, especially Nvidia.
OpenAI president Greg Brockman described the effort as part of a "full-stack" strategy to improve efficiency and reduce cost, arguing that controlling more layers of infrastructure allows better optimization across the entire system.
The motivation is also economic. Inference, not training, is where AI products interact with users every day, meaning even small efficiency gains can translate into large cost savings at scale. This could improve margins for services like ChatGPT and future agent-style products, especially as demand for AI compute continues to surge.
4. Industry implications and competitive pressure
OpenAI is not alone in this direction. Companies like Google and Amazon have already built their own AI chips, and Microsoft has its own accelerator efforts.
But Jalapeño signals OpenAI's intention to join that club more directly, while still relying on partners like Nvidia for training workloads. The AI industry is fragmenting its hardware stack, with major players racing to control both software and silicon. Broadcom, meanwhile, is emerging as a key behind-the-scenes winner, supplying the networking and chip-building expertise that powers many of these custom platforms.
5. Early tests show big efficiency gains, but no final numbers yet
OpenAI says early internal testing suggests Jalapeño delivers significantly better performance per watt than current state-of-the-art chips. However, the company has not yet released a full technical report, and final benchmarks are still pending.
Reports indicate that the design reduces data movement, a key bottleneck in AI inference workloads, and improves the balance of compute, memory, and networking. Industry observers note that these claims, while promising, are still early and will need independent validation once the chip is deployed at scale.
Why this matters to the market
For consumers and businesses, this isn't just an engineering milestone; it's an economic necessity.
Running modern AI products is expensive. When a user asks ChatGPT a multi-step coding question or interacts with a continuous digital agent, data centers burn large amounts of electricity. If Jalapeño can sharply cut the power required to answer those queries, OpenAI can lower its substantial operating costs.
For everyday users, that could translate into faster response times, more capable free tools, and cheaper API access for developers building AI applications.
However, the move comes with major trade-offs. ASICs are notoriously rigid; if AI architecture shifts radically in the next two years, a highly specialized chip like Jalapeño risks becoming obsolete. Furthermore, this chip only handles inference, not the computationally massive training process required to build a model like GPT-5. OpenAI is still entirely beholden to Nvidia for training hardware.
Ultimately, this is a long-term infrastructure play. While Broadcom CEO Hock Tan noted that initial "small prototype development" will drop in late 2026, the real volume isn't expected to go "full tilt" until the first half of 2028. OpenAI has a massive hill to climb to reach its goal of 10 gigawatts of compute by 2029.
Jalapeño won't kill Nvidia's monopoly tomorrow, but it gives OpenAI a vital shield against surging hardware costs and a powerful chip to play at the negotiating table.
For more on OpenAI's broader business strategy, read our coverage of the company's confidential SEC filing that could pave the way for a future IPO.