
What Is an AI ASIC? A Complete Guide to AI Chips and How They Compare to GPUs

The complete guide to AI ASICs: what they are, who makes them, and why they're eating the GPU market.

Ian Philpot

The AI hardware market is fracturing. In 2026, hyperscaler AI ASICs are projected to grow 44.6% while GPUs grow 16.1%, according to TrendForce. Bloomberg Intelligence projects the custom AI ASIC market reaching $118 billion by 2033 at a 27% compound annual growth rate — nearly double the pace of the broader AI accelerator market. And in December 2025, NVIDIA spent $20 billion licensing Groq's inference chip technology, the largest deal in company history. The market has made up its mind: specialized silicon for AI workloads is no longer a fringe bet.

For Bitcoin miners, ASICs are familiar territory. Application-specific integrated circuits have powered Bitcoin mining since January 2013, and the industry's entire economic model rests on the idea that purpose-built silicon beats general-purpose compute. That same thesis is now reshaping AI infrastructure.

Table of Contents

  1. What is an ASIC?
  2. What is an AI ASIC?
  3. ASIC vs. GPU for AI: The Tradeoff
  4. Training vs. Inference: Why AI ASICs Are Having a Moment
  5. Who Makes AI ASICs? The Two Camps
  6. The AI ASIC Market: How Big, How Fast
  7. What This Means for Bitcoin Miners
  8. Latest Developments in AI ASICs
  9. Frequently Asked Questions

What Is an ASIC?

ASIC stands for application-specific integrated circuit — silicon designed to perform one workload, not general-purpose compute. The tradeoff is fundamental: ASICs deliver massive efficiency gains for their target workload and near-zero flexibility for anything else. A CPU can run a spreadsheet, a video game, and a neural network. An ASIC can run one thing, and run it exceptionally well.

Bitcoin mining is the most commercially successful ASIC application in computing history. On January 19, 2013, Canaan Creative launched the Avalon1, the first consumer Bitcoin ASIC miner, and the "ASIC era" of Bitcoin mining began. The industry had already cycled through CPUs (2009), GPUs (2010), and FPGAs (2011), and each transition had brought order-of-magnitude efficiency gains for SHA-256 hashing. ASICs were the logical endpoint — silicon that did nothing but mine Bitcoin, and did it billions of times more efficiently than a general-purpose processor.
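For readers newer to mining hardware, the entire workload a Bitcoin mining ASIC implements in silicon is a single operation: hash an 80-byte block header twice with SHA-256 and check whether the result falls below a difficulty target. Here is a minimal Python sketch of that operation; the header below is a zero-filled placeholder, not real block data.

```python
import hashlib

def double_sha256(block_header: bytes) -> bytes:
    """The one operation a Bitcoin mining ASIC performs: SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(block_header).digest()).digest()

# Illustrative 80-byte header: real headers pack version, previous block hash,
# merkle root, timestamp, and difficulty bits ahead of the 4-byte nonce.
nonce = 12345
header = bytes(76) + nonce.to_bytes(4, "little")

digest = double_sha256(header)
print(digest[::-1].hex())  # block hashes are conventionally displayed byte-reversed

# A valid block requires this hash, interpreted as a 256-bit integer,
# to fall below the network's difficulty target.
```

A mining ASIC does nothing but repeat that loop with a fresh nonce, billions to trillions of times per second, which is exactly why it can do nothing else.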

The chips themselves have evolved dramatically. Bitcoin mining ASICs started at 130nm process nodes in 2013 and have shrunk to 3nm today; measured against 2009-era CPU mining, hashing performance has improved roughly 100-billion-fold. Manufacturers like Bitmain, MicroBT, and Canaan have turned mining ASICs into a multi-billion-dollar industry, and Hashrate Index's own ASIC price index tracks the secondary market for these chips daily.

The point is this: ASICs aren't new. The concept is battle-tested, the economics are well understood, and the industrial playbook for designing, manufacturing, and deploying them at scale is mature. What's new is their application to AI workloads — a completely different math problem, but the same underlying custom-silicon thesis. If you understand why an Antminer beats a general-purpose server at SHA-256 hashing, you already understand why a Google TPU beats an NVIDIA GPU at transformer inference.

What Is an AI ASIC?

An AI ASIC is a chip designed specifically to run artificial intelligence workloads — most commonly neural network inference, training, or both — rather than general-purpose compute or graphics rendering. Where a Bitcoin mining ASIC performs one operation (SHA-256 hashing) trillions of times per second, an AI ASIC performs matrix multiplications and tensor operations optimized for the specific mathematical patterns of modern neural networks. Same custom-silicon principle, completely different math.
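To make "completely different math" concrete: the inner loop of transformer inference is dominated by dense matrix multiplications, activations flowing through learned weight matrices. A minimal NumPy sketch of that core operation follows; the dimensions are illustrative and not taken from any particular model.

```python
import numpy as np

# Illustrative sizes: a batch of 8 tokens passing through one projection layer
# of a transformer with a 4,096-wide hidden dimension (made-up dimensions).
hidden_dim = 4096
activations = np.random.randn(8, hidden_dim).astype(np.float16)       # token activations
weights = np.random.randn(hidden_dim, hidden_dim).astype(np.float16)  # learned weights

# The operation AI ASICs are built around: a dense matrix multiply.
# TPU-style systolic arrays dedicate most of their die area to exactly this.
outputs = activations @ weights
print(outputs.shape)  # (8, 4096)
```

Where a mining ASIC repeats one fixed 256-bit hash, an AI ASIC repeats this multiply-accumulate pattern across billions of weights on every forward pass.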

AI ASICs come in two broad flavors worth naming. Training ASICs are optimized for the compute-intensive process of building a model from raw data — examples include Google's TPU v7 Ironwood and AWS Trainium2. Inference ASICs are optimized for running a trained model to generate outputs — examples include AWS Inferentia2, Groq's LPU, and Etched's Sohu. Some chips do both. Some hyperscalers split the work across a dedicated two-chip strategy (AWS being the clearest example with Trainium and Inferentia).

AI ASICs differ from GPUs in a way that maps directly onto the Bitcoin mining ASIC vs. CPU/GPU transition. GPUs are flexible parallel processors originally designed for graphics rendering, later adapted for machine learning because their parallelism happens to suit matrix math. AI ASICs strip away all the flexibility that makes GPUs general-purpose — branch predictors, reorder buffers, caches, graphics pipelines — and replace it with silicon dedicated entirely to specific AI operations. The result is a chip that can't play a video game, can't run a database, and can't render a 3D scene, but, by vendor claims, can run transformer inference at roughly 10x the throughput of a flagship GPU at a fraction of the power draw.

One confusing thing about the AI ASIC landscape: vendors don't all use the same acronyms. Google calls its chip a TPU (Tensor Processing Unit). Groq calls its chip an LPU (Language Processing Unit). Intel and Qualcomm use NPU (Neural Processing Unit). Meta uses MTIA (Meta Training and Inference Accelerator). Broadcom refers to custom hyperscaler designs as XPUs. All of these are AI ASICs. The terminology is fragmented by design — each vendor wants to own its own acronym — but the underlying category is the same.

ASIC vs. GPU for AI: The Tradeoff

The question readers ask most often is some version of: if AI ASICs are so efficient, why does anyone still buy GPUs? The answer is that AI ASIC vs GPU isn't a question of which chip is "better" — it's a question of what workload the chip is running, at what volume, and under what economic constraints. Here's the core tradeoff:

| Dimension | GPU (H100 / Blackwell) | AI ASIC (TPU, Trainium, LPU) |
|---|---|---|
| Flexibility | High — runs any workload | Low — optimized for specific tasks |
| Upfront cost per chip | ~$30K–$40K | Design costs: $10M–$100M+ |
| Inference cost per token | Higher | 50–67% lower at scale |
| Power efficiency | Baseline | 3–8x better (varies by design) |
| Software ecosystem | CUDA — mature, vast | JAX, Neuron SDK, proprietary |
| Time to market | Available now | 2–4 year development cycle |
| Best for | Research, training, flexibility | High-volume production inference |

GPUs win on flexibility and ecosystem maturity. CUDA is 15+ years old, it has a vast library of optimized software, and it runs any AI workload you throw at it. This is why GPUs dominate research and early-stage training: when you don't know exactly what you're going to run, flexibility is worth paying for.

AI ASICs win on per-token economics and power efficiency at scale. Once a workload is mature, predictable, and running at hyperscale volume, the design-cost premium of a custom chip pays back quickly. A 50–67% reduction in inference cost per token translates to hundreds of millions of dollars in annual savings at the volumes at which Google, AWS, and Meta run their AI workloads.

The decision isn't ideological, and it isn't either/or. Hyperscalers run heterogeneous fleets: GPUs for training and new workloads, ASICs for mature production inference. OpenAI is simultaneously building custom inference ASICs with Broadcom and signing $100B+ GPU deals with NVIDIA — because the workload mix demands both. The right framing isn't "ASIC vs GPU" but "which workload belongs on which chip."

One clarification worth making: the phrase "ASIC vs GPU" appears in two contexts. In the Bitcoin mining world, it refers to SHA-256 mining hardware (ASICs) versus graphics cards that can also be used for mining certain other algorithms (GPUs). In the AI world, it refers to custom AI silicon (ASICs) versus general-purpose parallel processors (GPUs). The underlying economic logic is the same — custom silicon beats general-purpose silicon for well-defined, high-volume workloads — but the specific chips, the specific math, and the specific vendors are entirely different.

Training vs. Inference: Why AI ASICs Are Having a Moment

The structural reason AI ASICs are growing faster than GPUs in 2026 comes down to a shift in where AI money is actually being spent. Training a model happens once. Inference happens every time a user submits a prompt. At companies with deployed AI products, inference costs now massively exceed training costs — OpenAI's internal 2024 figures showed inference running 15–118x more expensive than training depending on the model.

This inversion changes the hardware economics completely. Training is a one-time capital event where flexibility matters: you don't know exactly what you'll train next, so flexible GPUs are the right call. Inference is a continuous operational cost that scales linearly with every user prompt: the more successful your product, the more inference you run, and the more every percentage point of cost-per-token efficiency compounds.

At hyperscale, a 50–67% cost reduction per token justifies the $10M–$100M+ design cost of a custom ASIC many times over within a single year of deployment. This is why Google has been iterating on TPUs since 2013, AWS has been shipping Trainium and Inferentia since 2018, and Microsoft, Meta, and OpenAI are all now building their own inference chips. The math forces the decision.
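As a rough illustration of that payback math, here is a back-of-the-envelope sketch in Python. Every input is an assumption chosen for illustration, not a figure from TrendForce, Bloomberg Intelligence, or any vendor; only the 50–67% savings range and the $10M–$100M+ design-cost range come from the discussion above.

```python
# Back-of-the-envelope payback on a custom inference ASIC.
# All inputs below are illustrative assumptions, not reported figures.

tokens_per_year = 1e15       # assumed annual inference volume at hyperscale
gpu_cost_per_mtok = 0.50     # assumed blended GPU inference cost per million tokens, USD
asic_savings_rate = 0.55     # within the 50-67% per-token savings range cited above
asic_design_cost = 100e6     # upper end of the $10M-$100M+ design cost range

gpu_annual_cost = tokens_per_year / 1e6 * gpu_cost_per_mtok
annual_savings = gpu_annual_cost * asic_savings_rate
payback_years = asic_design_cost / annual_savings

print(f"GPU inference bill:  ${gpu_annual_cost / 1e6:,.0f}M per year")
print(f"ASIC savings:        ${annual_savings / 1e6:,.0f}M per year")
print(f"Design-cost payback: {payback_years:.2f} years")
```

Under those assumptions the design cost pays back in a few months. The exact numbers matter far less than the structure of the calculation: payback is driven by token volume, which is precisely what hyperscalers have and smaller operators do not.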

At NVIDIA's GTC 2026 conference in San Jose, OpenAI's Sachin Katti framed this shift explicitly: AI infrastructure is moving from a training-era footprint to an inference-era footprint, with downstream implications for hardware mix, cooling design, networking topology, and storage hierarchy. Industry practitioners at the Pacific Telecommunications Council (PTC) 2026 conference echoed the point from a different angle — noting that AI hardware value tails are likely longer than current hyperscaler depreciation schedules suggest, meaning specialized inference silicon will have years of deployment runway rather than being obsoleted by the next GPU generation.

The bottom line: the AI ASIC market's 44.6% projected growth in 2026 isn't speculative enthusiasm. It's the predictable consequence of inference becoming the dominant AI workload — and inference is the workload where specialized silicon wins most decisively.

Who Makes AI ASICs? The Two Camps

The AI ASIC market divides cleanly into two camps: hyperscalers building chips for their own clouds, and independents building chips to sell commercially (or to enable hyperscalers in building their own). Each company below gets a deeper treatment in the two-part AI ASIC Market Report that follows this piece — this section is the roster.

Hyperscaler AI ASIC Makers

Google
HQ: Mountain View, California, USA
Specialty: Training + inference TPUs (Tensor Processing Unit)
About: The longest-running hyperscaler ASIC program, now in its 7th generation (TPU v7 Ironwood), with a landmark $10B+ Anthropic deal securing up to 1 million TPUs.

Amazon Web Services
HQ: Seattle, Washington, USA
Specialty: Separate training (Trainium2) and inference (Inferentia2) chips
About: Two-chip strategy mirroring the training/inference economic split, with 500,000+ Trainium2 chips deployed in production as of late 2025.

Microsoft
HQ: Redmond, Washington, USA
Specialty: Azure AI accelerators (Maia 100, Maia 200)
About: Six years behind Google on custom silicon; still runs ~70% of Azure AI workloads on NVIDIA, but the OpenAI workload volume justifies the internal investment.

Meta
HQ: Menlo Park, California, USA
Specialty: Recommendation and ranking ASICs (MTIA — Meta Training and Inference Accelerator)
About: Uncommercialized internal chip optimized for ad ranking and feed recommendation — a narrow but massive-volume workload.

OpenAI
HQ: San Francisco, California, USA
Specialty: Custom inference ASIC (in development with Broadcom)
About: The clearest signal that hyperscaler custom silicon is no longer limited to the Big Four clouds — OpenAI's first chip is targeting Q3 2026 at the earliest.

Independent AI ASIC Makers — Design Enablers

Broadcom
HQ: Palo Alto, California, USA
Specialty: Custom ASIC design services + AI networking silicon
About: The design partner of record for Google TPUs, Meta MTIA, and OpenAI's custom chip — commanding 60–80% of the custom AI ASIC design services market per Bloomberg Intelligence.

Marvell Technology
HQ: Santa Clara, California, USA
Specialty: Custom ASIC design services + networking silicon
About: Broadcom's primary competitor in design services, with 20–25% market share anchored by AWS and Microsoft design wins.

Independent AI ASIC Makers — Direct Competitors

Groq (technology licensed to NVIDIA, December 2025)
HQ: Mountain View, California, USA
Specialty: Inference ASICs (LPU — Language Processing Unit)
About: NVIDIA licensed Groq's technology for $20B — the largest deal in NVIDIA history — and unveiled the Groq 3 LPU at GTC 2026 three months later.

Cerebras Systems
HQ: Sunnyvale, California, USA
Specialty: Wafer-scale AI accelerators (WSE-3)
About: Uses an entire silicon wafer as a single chip with 900,000 cores — targeting an April 2026 IPO at a $22–25B valuation after a transformative $10B+ OpenAI deal.

Etched
HQ: San Jose, California, USA
Specialty: Transformer-only inference ASIC (Sohu)
About: The most concentrated bet in AI hardware — permanently burns the transformer architecture into silicon for a claimed 20x throughput advantage over GPU baselines.

Tenstorrent
HQ: Toronto, Ontario, Canada (+ Santa Clara, California, USA)
Specialty: RISC-V AI processors + open IP licensing
About: Led by legendary chip architect Jim Keller; challenges NVIDIA through open architectures rather than closed silicon, with signed contracts from LG, Hyundai, and Samsung.

Tensordyne
HQ: Sunnyvale, California, USA
Specialty: Logarithmic number system inference ASIC
About: The most technically radical approach — computes inference in the logarithmic domain for claimed 8x power efficiency vs NVIDIA GB200; first silicon tape-out targeted mid-2026.

The hyperscaler camp gets full analytical treatment in [Part 1 of our AI ASIC Market Report]. The independents get their own deep-dive in [Part 2].

The AI ASIC Market: How Big, How Fast

The AI ASIC market is one of the fastest-growing segments in technology. Three data points frame the opportunity.

First, the growth differential. TrendForce projects hyperscaler ASICs growing 44.6% in 2026 versus 16.1% for GPUs — the first year in which AI ASIC growth is projected to outpace GPU growth since the two categories have been tracked separately. Second, the long-term forecast. Bloomberg Intelligence projects the custom AI ASIC market reaching $118 billion by 2033 at a 27% CAGR, rising from 8% of the total AI accelerator market in 2024 to 19% by 2033. Third, the broader accelerator market context. The total AI accelerator market — GPUs plus ASICs plus everything else — is projected to reach $604 billion by 2033 at 16% CAGR, up from $116 billion in 2024.

MarketsandMarkets segments the ASIC market across function (training, inference), deployment (on-premises, cloud, hybrid), and application (generative AI, machine learning, computer vision, natural language processing) — useful granularity for readers who want to understand how the market subdivides by workload type.

The capital backdrop supporting this growth is extraordinary. Bloomberg Intelligence projects $3.5 trillion in AI-related capital expenditures through 2030, with Microsoft alone guiding to $150B+ in 2026 CapEx. OpenAI's infrastructure roadmap could account for over $1 trillion through 2030. Against that backdrop, the ASIC market's share — roughly one-fifth of AI accelerator spend by 2033 — represents a structural shift in how hyperscalers deploy their infrastructure dollars.

One concentration point worth flagging: Broadcom and Marvell together control 80%+ of the custom AI ASIC design services market. When Google, Meta, and OpenAI decide to build custom silicon, they don't build it alone — they partner with one of these two firms. The AI ASIC wave is lifting a small number of boats very high, and Broadcom's AI semiconductor revenue alone grew 106% year-over-year in Q1 FY2026.

What This Means for Bitcoin Miners

The AI ASIC wave and the Bitcoin mining ASIC industry share the same fundamental thesis: custom silicon beats general-purpose compute for well-defined, high-volume workloads at scale. If that thesis worked for Bitcoin — and more than a decade of network hashrate growth, from ~20 TH/s in 2013 to over 1,000 EH/s today, makes clear that it did — it's working for AI. Miners who've spent years reasoning about application-specific silicon economics have a more intuitive grasp of the AI ASIC market than most generalist tech observers.

The infrastructure overlap is more direct than it first appears. AI ASICs need the same inputs as Bitcoin mining ASICs: power at scale, cooling capacity, data center real estate, advanced manufacturing at TSMC, and increasingly the same constrained supply chain for HBM memory and advanced packaging. Miners who understand power procurement, grid relationships, and cooling economics understand the hardest parts of AI infrastructure — and in many cases those are harder than the chip-level economics.

The competitive signal for hybrid operators is clear. As AI ASICs cut into GPU margins at 44.6% vs 16.1% growth, the economic case for pairing Bitcoin mining with AI inference deployments — what we've called mullet mining — gets stronger, not weaker. Purpose-built inference ASICs deployed alongside or adjacent to Bitcoin mining operations capture exactly the workload where specialized silicon wins most decisively, at exactly the sites where miners already have grid relationships, power contracts, and cooling capacity.

There's also a capital allocation signal worth sitting with. Investors evaluating Bitcoin mining infrastructure companies should understand that the "custom silicon at scale" thesis is now being validated by the largest technology companies on earth. Every dollar Broadcom earns designing TPUs for Google is a dollar of validation for the same industrial logic that drives Bitmain and MicroBT's businesses. The thesis has moved from "will custom silicon at scale work?" to "which workloads justify it?" — and the answer for both SHA-256 hashing and transformer inference is: yes, at industrial scale, with massive ongoing CapEx.

Latest Developments in AI ASICs

The AI ASIC market moves fast. These are the most significant developments shaping the landscape as of publication:

  • December 2025 — NVIDIA announces a $20 billion license and talent deal for Groq, the largest transaction in NVIDIA history. (Tom's Hardware)
  • January 2026 — Bloomberg Intelligence projects the custom AI ASIC market reaching $118 billion by 2033 at 27% CAGR. (Bloomberg Intelligence)
  • January 2026 — Etched raises $500M at a $5B valuation, led by Stripes with Peter Thiel and Ribbit Capital participating.
  • February 2026 — Cerebras closes a $1B Series H at a $22B valuation, following a $10B+ OpenAI compute deal announced weeks earlier.
  • March 2026 — NVIDIA unveils the Groq 3 LPU at GTC 2026 in San Jose, one of the fastest hardware integrations in semiconductor history. Cerebras announces an additional deal with Amazon.
  • April 2026 — Cerebras targets an April IPO window with Morgan Stanley as lead underwriter, raising $2B at a $22–25B valuation. (AI CERTs News)
  • April 2026 — Google announces its 8th-generation TPUs at Google Cloud Next: TPU 8t for training and TPU 8i for inference, the first time Google has split the TPU family into specialized training and inference chips. (Google)

For deeper analysis of the hyperscaler camp, see [Part 1 of our AI ASIC Market Report]. For the independent manufacturers, see [Part 2].

Frequently Asked Questions

What is an AI ASIC?

An AI ASIC is a chip designed specifically to run artificial intelligence workloads — most commonly neural network inference, training, or both — rather than general-purpose compute. Unlike GPUs, which retain flexibility for graphics and other workloads, AI ASICs strip away non-AI functionality to maximize throughput and power efficiency for specific AI operations like matrix multiplication and transformer inference. Common examples include Google's TPU, AWS Trainium and Inferentia, Groq's LPU, and Meta's MTIA.

What's the difference between an AI ASIC and a GPU?

GPUs are flexible parallel processors originally designed for graphics rendering; they run any AI workload because their parallelism suits matrix math. AI ASICs are purpose-built for specific AI operations and sacrifice flexibility for efficiency. The tradeoff: GPUs cost ~$30K–$40K per chip with a mature software ecosystem (CUDA) and work for any workload; AI ASICs require $10M–$100M+ in design costs but deliver 50–67% lower inference cost per token and 3–8x better power efficiency at scale. GPUs win for training and research; AI ASICs win for mature production inference at hyperscale volume.

What's the difference between an AI ASIC and a Bitcoin mining ASIC?

Both are application-specific integrated circuits — custom silicon built for one workload — but the workloads are completely different. Bitcoin mining ASICs perform one operation (SHA-256 hashing) trillions of times per second to secure the Bitcoin network and earn block rewards. AI ASICs perform matrix multiplications and tensor operations optimized for neural network architectures like transformers. Same underlying design philosophy (custom silicon for specific workloads), same manufacturing base (largely TSMC), but different math, different software, and different market dynamics. They're industrial cousins, not direct competitors.

Who makes AI ASICs?

AI ASIC makers divide into two camps. Hyperscalers build chips for their own cloud services: Google (TPU), AWS (Trainium, Inferentia), Microsoft (Maia), Meta (MTIA), and OpenAI (in development with Broadcom). Independent manufacturers build chips commercially or enable hyperscaler designs: Broadcom and Marvell provide ASIC design services to hyperscalers, while direct competitors include Groq (technology licensed to NVIDIA), Cerebras, Etched, Tenstorrent, and Tensordyne. Broadcom and Marvell together control 80%+ of the custom AI ASIC design services market.

What are the biggest AI ASIC companies?

By market impact, Broadcom is the largest AI ASIC company — it commands 60–80% of the custom AI ASIC design services market and has guided to $100B+ in AI chip revenue by FY2027. Google runs the largest hyperscaler ASIC deployment, with its TPU program now in its 7th generation (TPU v7 Ironwood). Cerebras is the largest independent AI ASIC company targeting a public listing, aiming for a $22–25B IPO in April 2026. NVIDIA became a significant AI ASIC company in December 2025 via its $20B Groq licensing deal, adding LPU technology to its GPU-dominant portfolio.

Is NVIDIA an AI ASIC company?

NVIDIA is primarily a GPU company, but the December 2025 Groq licensing deal added AI ASIC technology to its portfolio. The Groq LPU is a dedicated inference ASIC, and NVIDIA now sells GPU+LPU rack-scale systems targeting the decode phase of AI inference. NVIDIA also supports hyperscaler ASIC programs through NVLink Fusion, which allows custom ASICs to connect to NVIDIA's interconnect fabric. So the accurate answer: NVIDIA is a GPU company that now also makes AI ASICs and enables other companies' AI ASIC programs.

How does a TPU differ from a GPU?

A TPU (Tensor Processing Unit) is Google's family of AI ASICs, designed specifically for neural network workloads. GPUs are general-purpose parallel processors originally built for graphics. TPUs strip away graphics-specific hardware and optimize for matrix math at scale. Google claims its TPU v6 Trillium delivers roughly 4.7x the peak compute per chip and 67% better energy efficiency than the prior-generation TPU v5e, and it markets TPUs on price-performance for inference at scale. TPUs run on Google's JAX and TensorFlow software stacks rather than NVIDIA's CUDA. The practical distinction: GPUs run any workload and are available as discrete chips; TPUs run optimized AI workloads and are available primarily through Google Cloud.

What is the difference between an LPU and a GPU?

An LPU (Language Processing Unit) is Groq's AI ASIC architecture, optimized specifically for large language model inference. Unlike GPUs, which use external HBM memory and flexible execution pipelines, LPUs use on-chip SRAM and a compiler-controlled deterministic execution model that eliminates the synchronization overhead GPUs incur. Groq claims its LPU delivers approximately 10x the throughput of standard GPUs for LLM inference at 90% lower power consumption per operation. NVIDIA licensed Groq's LPU technology for $20 billion in December 2025 and now sells it alongside its GPU lineup.

What's the difference between AI training and AI inference?

Training is the compute-intensive process of building a neural network model from raw data — it's a one-time capital event that requires massive parallel processing and flexible hardware. Inference is the process of running a trained model to generate outputs for user queries — it's a continuous operational cost that scales with every prompt. At companies with deployed AI products, inference costs now massively exceed training costs. This matters for hardware decisions: training rewards the flexibility of GPUs, while inference rewards the per-token efficiency of purpose-built AI ASICs.

How big is the AI ASIC market?

The custom AI ASIC market is projected to reach $118 billion by 2033 at a 27% compound annual growth rate, per Bloomberg Intelligence. ASICs will grow from 8% of the total AI accelerator market in 2024 to 19% by 2033. The broader AI accelerator market (GPUs + ASICs + other accelerators) is projected to reach $604 billion by 2033 from $116 billion in 2024. TrendForce separately projects hyperscaler AI ASICs growing 44.6% in 2026 versus 16.1% for GPUs — the first time ASIC growth has outpaced GPU growth in the AI era.

Why are AI ASICs growing faster than GPUs?

AI ASICs are growing faster because inference — the ongoing cost of running trained models — has become the dominant AI workload, and inference is where purpose-built silicon wins most decisively. OpenAI's internal 2024 figures showed inference running 15–118x more expensive than training depending on the model. At hyperscale, a 50–67% reduction in inference cost per token justifies the $10M–$100M+ design cost of a custom ASIC within a single year. Every major hyperscaler — Google, AWS, Microsoft, Meta, OpenAI — is now building custom AI silicon for this reason.

Do AI ASICs use HBM memory?

Most AI ASICs use HBM (High Bandwidth Memory), though some designs deliberately avoid it. AWS Trainium, Google TPUs, Meta MTIA, and Etched Sohu all use HBM3 or HBM3e. Notable exceptions: Groq's LPU uses on-chip SRAM instead of HBM to achieve deterministic execution, and Cerebras' wafer-scale engine uses massive on-die SRAM (44GB) to eliminate the HBM dependency entirely. Tenstorrent deliberately designs around lower-cost memory to compete on cost rather than peak throughput. SK Hynix supplies roughly 62% of HBM used in AI chips, and HBM supply constraints are a structural bottleneck for the entire AI accelerator market.

Can you buy AI ASIC stock?

Several AI ASIC companies trade publicly. Broadcom (NASDAQ: AVGO) is the largest pure-play AI ASIC design services company and the most direct public investment in the category. Marvell Technology (NASDAQ: MRVL) is the second-largest design services firm. NVIDIA (NASDAQ: NVDA) became a meaningful AI ASIC company via the December 2025 Groq licensing deal. Cerebras Systems is targeting an April 2026 IPO on NASDAQ under the ticker CBRS at a $22–25B valuation. Most hyperscaler AI ASIC programs are embedded within parent companies (Google in Alphabet, AWS in Amazon, Maia in Microsoft, MTIA in Meta), so investment exposure comes through the parent stock rather than pure-play vehicles.

Are AI ASICs replacing NVIDIA GPUs?

AI ASICs aren't replacing NVIDIA GPUs — they're carving out specific workloads where purpose-built silicon wins, particularly high-volume production inference. NVIDIA's GPU market share may normalize from ~85–86% toward ~75% as ASICs capture inference workloads, but NVIDIA remains dominant for AI training, research, and mixed workloads where flexibility matters. NVIDIA's own response includes NVLink Fusion (opening its interconnect to third-party ASICs), the $20B Groq licensing deal (adding LPU technology to its portfolio), and the Vera Rubin platform (a full-stack system approach). The accurate framing: AI ASICs are becoming the second pillar of AI infrastructure alongside GPUs, not a replacement for them.

How are AI ASICs manufactured?

Nearly all advanced AI ASICs are manufactured at TSMC (Taiwan Semiconductor Manufacturing Company), which produces approximately 92% of the world's advanced AI chips at 7nm process nodes and below. Google TPUs, AWS Trainium, Etched Sohu, Cerebras WSE-3, and Broadcom's AI networking chips all rely on TSMC fabrication. Samsung manufactures some AI chips including the Groq 3 LPU (at Samsung's Taylor, Texas facility). TSMC is investing $100 billion in five new U.S. fabs and expanding to Japan to reduce geopolitical concentration risk. The TSMC dependency is the single biggest supply chain constraint facing the entire AI hardware market.


Ian Philpot

Marketing Director at Luxor Technology