Meta Spent $14 Billion to Win the AI Race. Its Next Model Still Isn't Ready.
Meta's Avocado model has been quietly pushed to May — even as the company bets $14.3 billion on Scale AI to close the gap with rivals. What's really going on inside Meta's AI machine?
Meta is spending money like a company that's terrified of falling behind. In the past month alone, it committed $14.3 billion to acquire 49% of Scale AI, poached its CEO Alexandr Wang to run a brand-new "superintelligence" lab, and sent Mark Zuckerberg on a personal recruiting blitz — reaching out directly via cold emails and WhatsApp messages, dangling seven- and eight-figure comp packages at researchers from Google, OpenAI, and Anthropic.
And yet: the company's next major AI model, internally codenamed Avocado, has been quietly delayed from March to at least May, according to a report from The New York Times. Performance, sources say, falls short of rivals.
This is the paradox at the heart of Meta's AI moment. The company is spending at a scale that would make most AI labs blush — and it still can't ship a model that keeps pace with Google and OpenAI. Understanding why tells you something important about where the AI race actually stands in 2026.
The Llama 4 Problem Nobody Wanted to Admit
To understand Avocado, you have to go back to Llama 4.
Llama 4 was supposed to be Meta's coming-out party in the frontier AI race. Instead, it became a cautionary tale. The release was delayed multiple times. When it finally shipped, the reception was underwhelming — benchmark results were mediocre, and real-world performance disappointed developers who had come to expect Llama releases to be competitive.
Then it got worse. Meta was caught gaming a public AI leaderboard to make Llama 4 Maverick appear more capable than it actually was. The company submitted a specifically fine-tuned variant of the model optimized for benchmark performance — a version that didn't reflect what developers would actually get in production. In the tight-knit AI research community, this kind of move doesn't go unnoticed, and it didn't.
Llama 4 Behemoth — the largest, most expensive variant, teased back in April 2025 — still hasn't been released.
Zuckerberg acknowledged the situation directly, saying that two of Meta's top priorities are making Meta AI "the leading personal AI" and "building full general intelligence." The gap between that stated ambition and the actual state of the company's models has never been more visible.
Avocado: Another Delay, Familiar Pattern
Avocado was supposed to be the response — the model that showed Meta had fixed the problems with Llama 4 and could genuinely compete with GPT-4o, Gemini 2.0, and Claude 3.7 Sonnet.
Instead, per the NYT report, Avocado is getting bumped to May. Performance apparently still falls short of leading rivals.
This isn't just an engineering embarrassment. It has real strategic consequences. Every month Avocado slips is another month where:
- Enterprise customers building on Meta's open models are evaluating alternatives
- Developer mindshare shifts further toward APIs and ecosystems where Meta doesn't compete
- The open-source AI narrative Meta has cultivated — "we'll democratize access to frontier models" — gets harder to sustain when the models aren't actually frontier-quality
- Competitors lap the field — Google has shipped Gemini 2.0, OpenAI continues iterating on GPT-4o variants, and Anthropic released Claude 3.7 Sonnet with extended thinking
The delay also comes at an awkward moment. Meta recently boasted that Meta AI has reached one billion monthly users — a staggering number that is technically true but misleading: the "users" include anyone who encounters the Meta AI assistant embedded across Instagram, WhatsApp, Messenger, and Facebook. Organic intent to use Meta AI as a primary assistant is a very different metric.
What $14.3 Billion Is Actually Buying
The Scale AI deal is worth understanding in detail, because it's not a straightforward acquisition — and the structure tells you something about Meta's strategy and constraints.
Meta is not buying Scale AI. It's acquiring a 49% non-voting stake for $14.3 billion, and separately hiring Scale CEO Alexandr Wang to run a new internal AI lab focused on superintelligence. Wang will report directly to Zuckerberg while remaining on Scale's board.
This structure is deliberate. It follows what The Verge calls "Big Tech's established playbook" — getting the key talent and strategic alignment without triggering a full acquisition review. Microsoft did it with Inflection AI (hired Mustafa Suleyman and the team). Amazon did it with Adept. The goal is access to capability and talent while managing regulatory risk.
What does Meta actually get?
1. Alexandr Wang and his network. Wang, who dropped out of MIT at 19 to found Scale and became one of the youngest self-made billionaires in tech, is one of the most connected people in AI infrastructure. He understands training data at a level few people do. And he apparently has the kind of recruiting pull that can attract people Zuckerberg couldn't get through normal channels.
2. Scale's data infrastructure and expertise. Scale's core business is training data: human annotation, RLHF pipelines, evaluation infrastructure. These are not glamorous, but they are decisive. The quality of training data and feedback loops is widely considered one of the key differentiators between models that feel genuinely capable and models that feel slightly off. Meta's data practices have been a recurring weakness.
3. Government AI relationships. Scale has a contract with the Department of Defense for an AI agent program and recently signed a five-year deal with Qatar. Meta, which is simultaneously defending itself in an FTC antitrust trial, may see value in the legitimacy and government relationships Scale brings.
4. A psychological reset. There's something to be said for the signal value of this deal: Zuckerberg is serious, he's willing to spend whatever it takes, and he's not outsourcing the effort to a distant research team. The new lab will have direct CEO accountability.
The Deeper Structural Problem
Money can solve a lot of problems in AI. It can buy compute, hire researchers, and acquire companies with training infrastructure. What it can't easily buy is the compound research progress that comes from years of focused iteration.
Meta's AI ambitions have been hampered by structural factors that predated the current investment cycle:
Organizational fragmentation. Meta's AI research was historically split between FAIR (Fundamental AI Research), the product teams, and various applied AI efforts. These groups had different incentives, different timelines, and often different views on what success looked like. Integrating them into a coherent push toward frontier model performance has been a persistent challenge.
Open source tension. Meta's decision to release Llama models openly has been strategically smart for community adoption and developer goodwill. But it creates a genuine tension: the best open-source release strategy and the best competitive strategy aren't the same thing. When you release your weights publicly, competitors can study the model, distill from it, and close the gap faster.
Advertising revenue as the organizing principle. Meta's core business is advertising. Its AI investments are, at some level, justified by the hypothesis that better AI drives better engagement and better ad targeting. This creates subtle distortions: models optimized for Meta's product surfaces (social feeds, messaging) may not translate cleanly into the kind of general-purpose capability that impresses benchmarks and earns developer trust.
The benchmark problem. Getting caught gaming the LMSYS leaderboard with Llama 4 Maverick wasn't just a PR problem — it was a symptom. If your internal evaluation culture tolerates shipping a benchmark-optimized variant rather than the actual product model, something is off in how success is being measured.
Who Benefits From Meta's Stumble?
The delay of Avocado and the continued struggles of Llama 4 create opportunity for everyone else in the space.
Google has quietly become the most underrated AI lab in the race. Gemini 2.0 Flash, in particular, has impressed developers with its speed and capability at a price point that makes it genuinely competitive for high-volume production use cases. Google's integration of AI across Search, Workspace, and Cloud gives it distribution advantages Meta doesn't have in enterprise contexts.
Anthropic continues to benefit from developer trust — particularly among the segment of the market that cares about reliability and reasoning quality over raw benchmark performance. Claude 3.7 Sonnet's "extended thinking" mode has found strong product-market fit in coding and analysis use cases.
Open-source alternatives. The irony of Meta's open-source strategy is that the models it released previously have spawned an ecosystem that no longer depends on Meta shipping the best version. Mistral, DeepSeek, and others have absorbed the open-weight model template and iterated aggressively. If Llama 5 (or whatever follows Avocado) isn't meaningfully ahead of what the open-source community has built independently, Meta's rationale for controlling the "leading open model" narrative weakens considerably.
What to Watch
A few things will tell us whether the Scale AI investment is transformative or just expensive:
1. Avocado's eventual benchmark performance. If it ships in May and genuinely competes with GPT-4o, Gemini 2.0, and Claude 3.7 Sonnet on third-party evaluations — not just Meta's own — that's a sign the structural changes are working.
2. Wang's recruiting results. If the new superintelligence lab lands a cohort of top-tier researchers in the next 90 days, the investment is paying off in the currency that matters most in AI: talent density.
3. Llama 4 Behemoth. The unreleased "largest and most expensive" Llama 4 variant is a wild card. If Meta finally ships it with genuinely impressive performance, it partially rehabilitates the Llama 4 story and buys credibility for Avocado.
4. Whether the open-source strategy holds. There's a real question about whether Meta will continue releasing frontier model weights openly once it's genuinely competitive. The incentive to do so decreases as the strategic stakes increase.
The Bottom Line
Meta is not losing the AI race in the obvious way — hemorrhaging customers and talent. It's losing in a subtler, more dangerous way: spending enormous sums while falling further behind the technical frontier, with the organizational changes meant to fix that still unproven.
The Scale AI deal is the right kind of bet — it addresses the data and talent gap that has held Meta back more than raw compute spending ever could. Wang is a legitimate upgrade over what Meta had. The new superintelligence lab, if Zuckerberg gives it the same focus he brought to the metaverse pivot (for better or worse), will at least have CEO-level visibility and urgency.
But Avocado being delayed is a reminder that you can't buy your way to compound research progress overnight. The models that are winning right now — GPT-4o, Gemini 2.0, Claude 3.7 — are the product of years of focused iteration, not a single large check.
Meta has the resources to close the gap. The question is whether the organizational changes it's making will actually produce the kind of research culture that can ship competitive frontier models on a predictable cadence.
Avocado is the first real test. May can't come soon enough.
Sources: The Verge — Meta is paying $14 billion to catch up in the AI race | The Verge — Meta Llama 4 Maverick benchmarks gaming | The New York Times — Meta AI model delay