Model Releases

New foundation models, benchmarks, and capability breakthroughs.

50 articles

Model Release

Anthropic Fable 5 Beats All AI Rivals on Code 2026

1 days ago

Model Release 12 min

Anthropic Fable 5 Beats All AI Rivals on Code 2026

Anthropic's Fable 5 tops SWE-Bench Pro at 80.3%, beating GPT-5.5 by 22 points, opening the Mythos-class tier at $10 per million tokens.

Jordan Hale1 days ago

Model Release

Apple Builds AFM Models With Google Gemini and Nvidia GPUs

2 days ago

Model Release 12 min

Apple Builds AFM Models With Google Gemini and Nvidia GPUs

Apple unveiled 5 AFM foundation models at WWDC 2026, distilling Google Gemini and running the cloud tier on Nvidia GPUs in Google Cloud infrastructure.

Jordan Hale2 days ago

Model Release

OpenAI GPT-5.6 Leak Signals 1.5M Context June Launch 2026

4 days ago

Model Release 14 min

OpenAI GPT-5.6 Leak Signals 1.5M Context June Launch 2026

OpenAI's GPT-5.6 appears in Codex routing logs under codename iris-alpha, with 1.5M tokens of context and 80-89% Polymarket odds of a June launch.

Jordan Hale4 days ago

Model Release

OpenAI Kills GPT-4.5 to Signal the GPT-5 Era in 2026

4 days ago

Model Release 12 min

OpenAI Kills GPT-4.5 to Signal the GPT-5 Era in 2026

OpenAI will retire GPT-4.5 from ChatGPT on June 27 and o3 on August 26, completing the first full clearout of the pre-GPT-5 model generation.

Jordan Hale4 days ago

Model Release

Alibaba Qwen 3.7 Plus Launches Vision Agents at 6x Cut

5 days ago

Model Release 12 min

Alibaba Qwen 3.7 Plus Launches Vision Agents at 6x Cut

Alibaba's Qwen 3.7 Plus adds vision and video to its Qwen backbone at $0.40 per million tokens, 6x cheaper than Qwen 3.7 Max, shipping June 1.

Jordan Hale5 days ago

Model Release

Microsoft MAI-Thinking-1 Beats Claude Sonnet in 2026

6 days ago

Model Release 11 min

Microsoft MAI-Thinking-1 Beats Claude Sonnet in 2026

Microsoft's MAI-Thinking-1, its first in-house reasoning model trained without OpenAI data, was preferred over Claude Sonnet 4.6 in blind human tests.

Jordan Hale6 days ago

Model Release

Anthropic Sonnet 4.8 Leak Signals It Skips 4.7 in 2026

6 days ago

Model Release 12 min

Anthropic Sonnet 4.8 Leak Signals It Skips 4.7 in 2026

Anthropic's leaked 512,000-line Claude Code source map exposes Sonnet 4.8 and a skipped 4.7, plus hidden API tiers and routing thresholds rivals can study.

Jordan Hale6 days ago

Model Release

Gemini 3.5 Pro Doubles Context to 2M Tokens in 2026

6 days ago

Model Release 12 min

Gemini 3.5 Pro Doubles Context to 2M Tokens in 2026

Gemini 3.5 Pro ships a 2 million token context window, double Flash, as Google launches its cheaper model first to pressure rivals on price.

Jordan Hale6 days ago

Model Release

Microsoft MAI-Code-1-Flash Cuts Coding Tokens 60% 2026

6 days ago

Model Release 12 min

Microsoft MAI-Code-1-Flash Cuts Coding Tokens 60% 2026

Microsoft MAI-Code-1-Flash, a 5B in-house coding model, beats Claude Haiku 4.5 on SWE-Bench Pro and uses up to 60% fewer tokens in GitHub Copilot.

Jordan Hale6 days ago

Model Release

Google Gemma 4 Launches a 12B Encoder Free Local Model

6 days ago

Model Release 12 min

Google Gemma 4 Launches a 12B Encoder Free Local Model

Google Gemma 4 12B is an encoder-free multimodal model that runs text, image, audio, and video on a 16GB laptop under Apache 2.0 with a 256K context.

Jordan Hale6 days ago

Model Release

Microsoft MAI Launches 7 Models to Cut OpenAI Reliance

6 days ago

Model Release 12 min

Microsoft MAI Launches 7 Models to Cut OpenAI Reliance

Microsoft launched 7 MAI models spanning code, reasoning, image, voice, and transcription at Build 2026, a full-stack bid to cut its OpenAI reliance.

Jordan Hale6 days ago

Model Release

Claude Opus 4.8 Surpasses GPT-5.5 as the Top AI Model

6 days ago

Model Release 12 min

Claude Opus 4.8 Surpasses GPT-5.5 as the Top AI Model

Claude Opus 4.8 scores 61.4 on the Artificial Analysis Intelligence Index and 69.2% on SWE-Bench Pro, beating GPT-5.5 just 41 days after Opus 4.7.

Jordan Hale6 days ago

Model Release

OpenAI Realtime 2 Cuts Voice AI Latency Below 1 Second

6 days ago

Model Release 12 min

OpenAI Realtime 2 Cuts Voice AI Latency Below 1 Second

OpenAI's GPT-Realtime-2 cuts voice agent latency below one second with GPT-5-class reasoning and a 128K context, resetting voice AI economics.

Jordan Hale6 days ago

Model Release

DeepSeek V4 Pro Undercuts US AI Coding Cost 95% 2026

Jun 5, 2026

Model Release 12 min

DeepSeek V4 Pro Undercuts US AI Coding Cost 95% 2026

DeepSeek V4-Pro made its 75% price cut permanent at $0.87 per million tokens, a frontier-grade coding model that undercuts US labs with open weights.

Jordan HaleJun 5, 2026

Model Release

MiniMax M3 Beats GPT-5.5 Coding at 1/12 Cost in 2026

Jun 5, 2026

Model Release 12 min

MiniMax M3 Beats GPT-5.5 Coding at 1/12 Cost in 2026

MiniMax M3, an open-weight model from Shanghai, beats GPT-5.5 on SWE-Bench Pro at 1/12 the cost, putting the AI coding frontier up for free download.

Jordan HaleJun 5, 2026

Model Release

Ideogram 4 Launches 9.3B Open-Weight Image Model 2026

Jun 5, 2026

Model Release 12 min

Ideogram 4 Launches 9.3B Open-Weight Image Model 2026

Ideogram 4 is a 9.3B open-weight text-to-image model with native 2K resolution that tops every open rival on text rendering and design quality.

Jordan HaleJun 5, 2026

Model Release

xAI Composer 2.5 Beats Grok 4.20 on Agentic Code 2026

Jun 5, 2026

Model Release 12 min

xAI Composer 2.5 Beats Grok 4.20 on Agentic Code 2026

xAI Composer 2.5 launched June 1, 2026, scoring 69.3 on agentic coding versus Grok 4.20 at 47.1, built on Moonshot Kimi K2.5 open weights.

Jordan HaleJun 5, 2026

Model Release

Qwen3.7 Max Surpasses Gemini Flash on AI Index 2026

Jun 5, 2026

Model Release 12 min

Qwen3.7 Max Surpasses Gemini Flash on AI Index 2026

Qwen3.7 Max scored 56.6 on the Artificial Analysis Index on May 20, 2026, the top Chinese model, beating DeepSeek V4 Pro on agentic coding.

Jordan HaleJun 5, 2026

Model Release

OpenAI Rosalind Cuts Genomics AI Compute 31% in 2026

Jun 4, 2026

Model Release 12 min

OpenAI Rosalind Cuts Genomics AI Compute 31% in 2026

OpenAI GPT-Rosalind beats GPT-5.5 across drug discovery and genomics while using 31% fewer tokens, and opens its science model preview worldwide.

Jordan HaleJun 4, 2026

Model Release

Nvidia Cosmos 3 Launches Open Physical AI Omnimodel

Jun 4, 2026

Model Release 12 min

Nvidia Cosmos 3 Launches Open Physical AI Omnimodel

Nvidia Cosmos 3 launches as the first fully open physical AI omnimodel, generating video and robot actions to cut robotics training from months to days.

Jordan HaleJun 4, 2026

Model Release

Google Gemma 4 Beats 400B Open Rivals on a 31B Model

Jun 4, 2026

Model Release 12 min

Google Gemma 4 Beats 400B Open Rivals on a 31B Model

Google Gemma 4 ships under Apache 2.0 and its 31B model beats 400B open rivals on agentic and coding tests while running on a single GPU.

Jordan HaleJun 4, 2026

Model Release

Nvidia Nemotron 3 Ultra Cuts Agent Cost 30% in 2026

Jun 4, 2026

Model Release 14 min

Nvidia Nemotron 3 Ultra Cuts Agent Cost 30% in 2026

Nvidia Nemotron 3 Ultra is a 550B open model with 55B active params running agents 5x faster at 30% lower cost, the fastest US open model in 2026.

Jordan HaleJun 4, 2026

Model Release

Microsoft Polaris Ends OpenAI Reliance in Copilot 2026

Jun 4, 2026

Model Release 13 min

Microsoft Polaris Ends OpenAI Reliance in Copilot 2026

Microsoft Polaris replaces GPT-4 Turbo as the default GitHub Copilot model in August 2026, running on in-house Maia chips and cutting OpenAI reliance.

Jordan HaleJun 4, 2026

Model Release

Gemini 3.5 Flash Beats Pro at 76% on Terminal Bench

Jun 4, 2026

Model Release 13 min

Gemini 3.5 Flash Beats Pro at 76% on Terminal Bench

Google's Gemini 3.5 Flash scores 76.2% on Terminal-Bench 2.1, the first Flash-tier model to beat its own Pro tier on coding and agentic tasks.

Jordan HaleJun 4, 2026

Model Release

Meta Muse Spark Delay Signals a Llama Crisis in 2026

Jun 4, 2026

Model Release 12 min

Meta Muse Spark Delay Signals a Llama Crisis in 2026

Meta has repeatedly delayed its Muse Spark API, the developer gateway to its next AI model, as rivals OpenAI and Anthropic ship live and pull ahead.

Jordan HaleJun 4, 2026

Model Release

Nvidia GR00T N2 Doubles Humanoid Robot Success in 2026

Jun 3, 2026

Model Release 12 min

Nvidia GR00T N2 Doubles Humanoid Robot Success in 2026

Nvidia GR00T N2 tops MolmoSpaces and RoboArena with 2x the rival success rate, a world action model for humanoid robots shipping in late 2026.

Jordan HaleJun 3, 2026

Model Release

OpenAI Ends GPT-4.5 in ChatGPT to Force GPT-5.5 Upgrade

Jun 3, 2026

Model Release 12 min

OpenAI Ends GPT-4.5 in ChatGPT to Force GPT-5.5 Upgrade

OpenAI retires GPT-4.5 from ChatGPT on June 27, routing users to GPT-5.5 and ending the era of user-selectable model choice in its consumer app.

Jordan HaleJun 3, 2026

Model Release

Microsoft MAI-Thinking-1 Ends OpenAI Data Dependence

Jun 3, 2026

Model Release 12 min

Microsoft MAI-Thinking-1 Ends OpenAI Data Dependence

Microsoft MAI-Thinking-1 is its first reasoning model trained with zero OpenAI data, scoring 97% on AIME 2025 and matching Claude Opus 4.6 on SWE-Bench.

Jordan HaleJun 3, 2026

Model Release

Microsoft MAI-Code-1-Flash Undercuts Claude Haiku 2026

Jun 3, 2026

Model Release 12 min

Microsoft MAI-Code-1-Flash Undercuts Claude Haiku 2026

Microsoft MAI-Code-1-Flash beats Claude Haiku 4.5 and cuts token use 60 percent as GitHub Copilot moves to metered credit billing this month.

Jordan HaleJun 3, 2026

Model Release

xAI Grok 4.3 Cuts Cost 20 Percent and Tops Legal AI

Jun 3, 2026

Model Release 13 min

xAI Grok 4.3 Cuts Cost 20 Percent and Tops Legal AI

xAI's Grok 4.3 scores ELO 1500 on GDPval-AA and ranks first on CaseLaw v2, while cutting benchmark costs 20 percent below Grok 4.20.

Jordan HaleJun 3, 2026

Model Release

Microsoft MAI Beats Claude Sonnet at 10x Lower Cost

Jun 3, 2026

Model Release 12 min

Microsoft MAI Beats Claude Sonnet at 10x Lower Cost

Microsoft's MAI-Thinking-1 matches Claude Sonnet 4.6 in blind tests at 10x lower cost, signaling a new phase of AI model competition.

Jordan HaleJun 3, 2026

Model Release

Claude Opus 4.8 Beats GPT-5.5 on SWE-bench Pro 2026

Jun 2, 2026

Model Release 14 min

Claude Opus 4.8 Beats GPT-5.5 on SWE-bench Pro 2026

Claude Opus 4.8 scores 69.2% on SWE-bench Pro, beating GPT-5.5 at 58.6%, while its 3x fast mode makes it the cheapest frontier coding model available.

Jordan HaleJun 2, 2026

Model Release

OpenAI GPT-5.5 Instant Replaces ChatGPT Default 2026

Jun 2, 2026

Model Release 12 min

OpenAI GPT-5.5 Instant Replaces ChatGPT Default 2026

OpenAI replaces ChatGPT default with GPT-5.5 Instant, improving accuracy, STEM, and image understanding for all users. GPT-5.6 arrives June 2026.

Jordan HaleJun 2, 2026

Model Release

Google Gemini Omni Ends Single-Modality AI Era 2026

Jun 2, 2026

Model Release 12 min

Google Gemini Omni Ends Single-Modality AI Era 2026

Google Gemini Omni Flash generates video from any input in a unified reasoning pass, the first native any-to-any multimodal AI at I/O 2026.

Jordan HaleJun 2, 2026

Model Release

OpenAI GPT-5.4 Beats Humans at Computer Use in 2026

Jun 2, 2026

Model Release 12 min

OpenAI GPT-5.4 Beats Humans at Computer Use in 2026

OpenAI GPT-5.4 scores 75.0% on OSWorld-Verified, beating the 72.4% human baseline, with a 1M token context window for long autonomous agent work.

Jordan HaleJun 2, 2026

Model Release

Nvidia Alpamayo 2 Launches a Free Robotaxi Brain in 2026

Jun 2, 2026

Model Release 12 min

Nvidia Alpamayo 2 Launches a Free Robotaxi Brain in 2026

Nvidia's Alpamayo 2 Super is a 32B open reasoning model for level 4 robotaxis, free on Hugging Face this summer, turning autonomy into a commodity.

Jordan HaleJun 2, 2026

Model Release

MiniMax M3 Beats GPT-5.5 at Just 10% of Coding Cost

Jun 2, 2026

Model Release 12 min

MiniMax M3 Beats GPT-5.5 at Just 10% of Coding Cost

MiniMax M3 scores 59% on SWE-Bench Pro, beating GPT-5.5 and Gemini 3.1 Pro at roughly 10% of the cost, with open weights due within ten days.

Jordan HaleJun 2, 2026

Model Release

Cohere Command A+ Breaks the 200B Open Model Barrier

Jun 1, 2026

Model Release 13 min

Cohere Command A+ Breaks the 200B Open Model Barrier

Cohere Command A+ ships 218B parameters under Apache 2.0 and runs on two H100 GPUs, the first 200B-plus open model with native citation grounding built in.

Jordan HaleJun 1, 2026

Model Release

Qwen 3.7 Max Beats Claude Opus 4.6 on GPQA Diamond

Jun 1, 2026

Model Release 13 min

Qwen 3.7 Max Beats Claude Opus 4.6 on GPQA Diamond

Qwen 3.7 Max scores 92.4 on GPQA Diamond with a 1M-token window and $2.50 input pricing, undercutting GPT-5.5 on cost while topping Claude on reasoning.

Jordan HaleJun 1, 2026

Model Release

Nvidia Cosmos 3 Launches First Open Physical AI Model

Jun 1, 2026

Model Release 13 min

Nvidia Cosmos 3 Launches First Open Physical AI Model

Nvidia's Cosmos 3 is the first open Physical AI omnimodel, ranking number one across seven robotics benchmarks in 8B Nano and 32B Super sizes.

Jordan HaleJun 1, 2026

Model Release

Nvidia Nemotron 3 Ultra Beats US Open Weight Rivals

Jun 1, 2026

Model Release 13 min

Nvidia Nemotron 3 Ultra Beats US Open Weight Rivals

Nvidia's Nemotron 3 Ultra is a 550B open model that tops US open-weights rankings at 300 tokens per second, with weights and training recipes released.

Jordan HaleJun 1, 2026

Model Release

DeepSeek V4 Cuts Frontier AI Coding Cost 75 Percent

Jun 1, 2026

Model Release 13 min

DeepSeek V4 Cuts Frontier AI Coding Cost 75 Percent

DeepSeek V4 makes its 75 percent price cut permanent at 0.435 per million input tokens, matching GPT-5.5 on agentic coding under an open MIT license.

Jordan HaleJun 1, 2026

Model Release

Google Gemma 4 31B Beats 400B Rivals at Zero License Cost

May 30, 2026

Model Release 13 min

Google Gemma 4 31B Beats 400B Rivals at Zero License Cost

Google Gemma 4 31B scores 89.2% on AIME 2026 and beats Llama 4 across math, code, and agentic tasks, all under a free Apache 2.0 open license.

Jordan HaleMay 30, 2026

Model Release

Claude Opus 4.8 Launches Agent Swarms for Complex Work

May 30, 2026

Model Release 12 min

Claude Opus 4.8 Launches Agent Swarms for Complex Work

Claude Opus 4.8 hits 88.6% on SWE-bench Verified and adds Dynamic Workflows to run hundreds of subagents at once, all at the prior Opus price.

Jordan HaleMay 30, 2026

Model Release

Gemini 3.5 Flash Beats 3.1 Pro and Cuts Cost to $1.50

May 30, 2026

Model Release 12 min

Gemini 3.5 Flash Beats 3.1 Pro and Cuts Cost to $1.50

Gemini 3.5 Flash is now generally available at $1.50 per million input tokens, beating Gemini 3.1 Pro on coding while running 4x faster.

Jordan HaleMay 30, 2026

Model Release

Google AI Ultra Cuts Price 60% and Undercuts OpenAI

May 29, 2026

Model Release 13 min

Google AI Ultra Cuts Price 60% and Undercuts OpenAI

Google cut its AI Ultra plan 60% from $250 to $100 a month and launched Gemini 3.5 Flash, undercutting OpenAI and Anthropic on frontier AI pricing.

Jordan HaleMay 29, 2026

Model Release

Meta Muse Spark: Capable Enough to Keep Closed

May 15, 2026

Model Release 11 min

Meta Muse Spark: Capable Enough to Keep Closed

Meta's first model under Alexandr Wang hits the top 5 on benchmarks but fails its safety review, marking the end of Meta's open-source AI strategy.

Jordan HaleMay 15, 2026

Model Release

NVIDIA Nemotron 3 Super Tops Open Model Agentic AI

May 15, 2026

Model Release 12 min

NVIDIA Nemotron 3 Super Tops Open Model Agentic AI

NVIDIA's Nemotron 3 Super scores 85.6% on PinchBench with 5x throughput over its predecessor, the strongest open agentic model for enterprise deployments.

Jordan HaleMay 15, 2026

Model Release

SubQ's 12M-Token Window Costs 1,000x Less Than GPT

May 15, 2026

Model Release 12 min

SubQ's 12M-Token Window Costs 1,000x Less Than GPT

Subquadratic's SubQ uses sparse attention to process 12 million tokens at 1/1,000th the compute of frontier models, raising $29M at a $500M valuation.

Jordan HaleMay 15, 2026

Model Release

Mira Murati's TML Model Ends the Turn-Taking Era of Voice AI

May 14, 2026

Model Release 11 min

Mira Murati's TML Model Ends the Turn-Taking Era of Voice AI

Thinking Machines Lab's TML-Interaction-Small scores 77.8 on FD-bench versus 54.3 for Gemini, built by Mira Murati to replace turn-based voice AI.

Jordan HaleMay 14, 2026