Eight hours. That's how long Sakana AI's new research agent, Marlin, can run autonomously before surfacing its findings. No calls for clarification. No billing in six-minute increments. No partner availability gaps. Sakana AI, the Tokyo-based startup founded by former Google Brain researchers and backed by $379 million in venture funding at a $2.65 billion valuation, launched Marlin commercially on June 15, 2026. In doing so, it made the most direct claim yet about AI agents and consulting: that they can replace a firm's strategic research function outright, rather than assist or augment it.
What Actually Happened
Sakana AI officially launched Marlin for enterprise customers on June 15, 2026, following a closed beta that ran from April through early June and involved approximately 300 professionals across financial institutions, operating companies, consulting firms, and think tanks. Marlin is an autonomous AI research agent that runs uninterrupted reasoning sessions of up to eight hours, producing structured outputs including detailed reports of more than 100 pages and executive slide decks formatted for board-level consumption. The product is available immediately through a pay-as-you-go pricing tier and is restricted to corporate entities, not individual consumers. According to VentureBeat, Marlin is designed to emulate the work of a Chief Strategy Officer and an entire research team, compressing several weeks of strategic analysis into a single unattended session.
The technology underlying Marlin is architecturally different from what powers consumer deep-research tools. Marlin is built on two research breakthroughs that Sakana has developed over the past two years. The first is AB-MCTS, or Adaptive Branching Monte Carlo Tree Search, a reasoning algorithm that was recognized with a Spotlight at NeurIPS 2025, a distinction given to a small share of the conference's accepted papers. The second is a workflow automation framework derived from Sakana's AI Scientist project, which demonstrated fully autonomous scientific discovery from hypothesis generation to peer-reviewed publication, and was published in Nature. These are not consumer-grade tools repurposed for enterprise; they represent research-grade methods applied to business strategy for the first time.
Marlin's commercial launch represents Sakana AI's first step from research organization to revenue-generating business. The company, co-founded by David Ha (formerly Google Brain and Stability AI) and Llion Jones (co-author of the original Transformer paper), has produced a stream of well-regarded AI research papers but had not previously commercialized any of it. According to Sakana AI's official announcement, Marlin is the company's first product designed to generate revenue at scale, and the enterprise-only positioning reflects both the seriousness of the use case and the liability considerations that come with autonomous AI producing strategic recommendations for corporate decision-makers.
Why This Matters More Than People Think
The management consulting industry collectively bills approximately $300 billion annually for services that range from strategy research to operational transformation. The highest-value slice of that market, the strategy research and synthesis that senior partners deliver to CEOs and boards, is also the slice that AI is most plausibly positioned to automate first. Unlike process transformation work, which requires organizational change management and implementation, strategy research is fundamentally a knowledge synthesis task: gather information, apply analytical frameworks, develop recommendations, and present them with authority. Marlin's claim is that it can do the synthesis part in eight hours for a fraction of what a week of a McKinsey partner's time costs.
The distinction between Marlin and consumer deep-research tools is important and often overlooked in first-pass coverage. OpenAI's Deep Research function, Google Gemini's research mode, and Perplexity's research assistant all operate in sessions measured in minutes, typically 5 to 20 minutes of active computation. They are designed to answer questions quickly, not to develop a strategic thesis over extended time. Marlin operates in an entirely different regime: eight hours of continuous, self-governing reasoning loops in which the agent identifies what it does not yet know, searches for that information, updates its model of the problem, branches into sub-questions, synthesizes across sources, and iterates until it reaches a defensible conclusion. The output is qualitatively different not just in length but in the depth of reasoning that sustained autonomous operation makes possible.
The timing of Marlin's launch also matters. Enterprise AI spending is accelerating in 2026, with CIOs and CFOs under pressure to demonstrate measurable productivity gains from AI investments that have so far produced broad-based efficiency improvements but limited transformation of high-value white-collar knowledge work. Marlin is positioned specifically as an answer to that pressure. Rather than making a consultant 20% faster, which is the promise of Copilot-style tools, Marlin's implicit proposition is that the research deliverable does not require a consultant at all. For CFOs who have been watching AI investment produce incremental gains, a tool that claims to eliminate a category of high-cost professional services spending represents a different kind of value proposition entirely.
The Competitive Landscape
No direct competitor currently occupies the same product space as Marlin. According to The Decoder, Marlin's closest analogues in the market are consumer deep-research tools that operate at fundamentally shorter time horizons and lower output complexity. OpenAI, Anthropic, and Google all have AI research assistance products, but all are designed around response times measured in minutes rather than autonomous multi-hour research cycles. The closest historical parallel is the Bloomberg Terminal in the 1980s, which did not simply speed up what financial analysts already did: it changed what was possible for an individual analyst to accomplish in a single session, and eventually changed who could perform that work at all. Marlin makes a similar claim for strategy research.
The consulting firms themselves have not been idle in AI. McKinsey's QuantumBlack, BCG's BCG X, and Bain's advanced analytics practices have all invested in AI-assisted research tooling for their own consultants. However, these are internal productivity tools that make existing consultants faster, not products that replace the consulting engagement itself. The competitive logic for Marlin runs at a different level: rather than competing with McKinsey for the same enterprise customers by making research faster, Marlin positions itself as the reason those customers question whether they need McKinsey in the first place. That is a fundamentally different competitive attack on a $300 billion industry than anything the consulting firms' own AI investments are designed to defend against.
A second historical parallel, and arguably the one that best captures Marlin's strategic position, is the spread of legal research databases like LexisNexis and Westlaw through law firms in the 1980s and 1990s. Those tools did not immediately eliminate legal associates: they initially made each associate more productive and allowed firms to staff matters more leanly. But over 20 years, they fundamentally changed the economics of legal research, compressed the time required for first-year associate training, and enabled new market entrants to offer legal services at lower price points. Marlin's multi-hour autonomous sessions point to the same kind of structural shift for strategy research: an initial phase of productivity augmentation that gradually enables a different organizational structure for the clients who adopt it most aggressively.
Hidden Insight: AB-MCTS Changes What Research Can Produce
The most important technical detail about Marlin is not the eight-hour session length but the algorithm driving those sessions. AB-MCTS, or Adaptive Branching Monte Carlo Tree Search, belongs to the Monte Carlo Tree Search family, the search technique that AlphaGo combined with deep neural networks to reach superhuman performance at the game of Go. In Go, MCTS explores a tree of possible moves, allocating computational resources to the most promising branches based on an ongoing assessment of each branch's value. Marlin applies the same logic to knowledge exploration: rather than working through research topics linearly, it branches into multiple sub-questions, evaluates which threads are most productive, allocates more reasoning cycles to those threads, and prunes branches that are not yielding new information. This is not a faster version of keyword search; it is a fundamentally different way of exploring an information space.
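To make the branch-and-allocate idea concrete, here is a minimal toy sketch in Python. It is not Sakana's implementation, and nothing about Marlin's internals is public in this level of detail. The sketch assumes that each research thread gets a quality score between 0 and 1, and it uses Thompson sampling over Beta posteriors to choose, at every node, between opening a new sub-question ("go wider") and descending into an existing one ("go deeper"). The names `Node`, `toy_evaluate`, and `search` are hypothetical, and `toy_evaluate` is a random stand-in for whatever assessment a real system would run.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    question: str
    score: float = 0.0                      # assessed value of this thread, in [0, 1]
    children: list["Node"] = field(default_factory=list)
    a: float = 1.0                          # Beta posterior: evidence that going deeper here pays off
    b: float = 1.0
    gen_a: float = 1.0                      # Beta posterior: evidence that widening here pays off
    gen_b: float = 1.0


def toy_evaluate(parent: Node, rng: random.Random) -> float:
    """Hypothetical stand-in for a real quality assessment of a new sub-question."""
    return min(1.0, max(0.0, parent.score + rng.uniform(-0.2, 0.3)))


def search(root: Node, budget: int, seed: int = 0) -> Node:
    """Spend `budget` expansions; return the highest-scoring node found."""
    rng = random.Random(seed)
    best = root
    for _ in range(budget):
        node, path = root, [root]
        while True:
            # Sample one value for "widen here" and one per existing child.
            options = [(rng.betavariate(node.gen_a, node.gen_b), None)]
            options += [(rng.betavariate(c.a, c.b), c) for c in node.children]
            _, pick = max(options, key=lambda o: o[0])
            if pick is None:                # "go wider": expand at this node
                break
            node = pick                     # "go deeper": follow the sampled child
            path.append(node)

        child = Node(f"{node.question}/{len(node.children)}",
                     score=toy_evaluate(node, rng))
        node.children.append(child)

        # Credit the widening decision at this node, and every ancestor on the path.
        node.gen_a += child.score
        node.gen_b += 1.0 - child.score
        for n in path:
            n.a += child.score
            n.b += 1.0 - child.score

        if child.score > best.score:
            best = child
    return best


root = Node("Should we enter market X?", score=0.3)
best = search(root, budget=50, seed=1)
```

In this sketch, weak branches are never deleted. They are simply sampled less often as evidence against them accumulates, which is a soft form of the pruning described above. A production system would replace the random evaluator with a real assessment step and would likely add explicit cutoffs.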
A compound-interest analogy helps clarify why sustained sessions matter: each round of revision builds on the corrected output of the round before it. A 20-minute deep-research session can identify the key facts about a topic and organize them coherently. An eight-hour session that re-evaluates its own findings mid-session, discovers that a key assumption was wrong, restructures the entire analytical framework, and then re-investigates from the corrected premise can produce a qualitatively different document than the same agent operating for a fraction of the time. The length is not padding; it is the mechanism by which the agent iterates toward a more accurate model of a complex business question. Sakana's NeurIPS Spotlight recognition for AB-MCTS suggests that peer reviewers found the approach's gains over simpler alternatives convincing.
The business model implication is where the disruption potential becomes clearest. Management consulting's pricing model rests on the scarcity of experienced strategic judgment and the transaction costs of assembling a team with the right expertise for each engagement. A Marlin session produces a 100-page strategy document and executive slides for a pay-as-you-go fee that, even at premium enterprise pricing, will be orders of magnitude cheaper than a McKinsey strategy sprint. If the output quality is comparable, the economic incentive for enterprises to shift even a fraction of their consulting spend to Marlin is compelling. Consulting firms' response, that a signed partner and a named team provide accountability and client relationships that an AI agent cannot, is real but applies most strongly to implementation work rather than research deliverables.
However, the bear case for Marlin is substantive. Critics argue that enterprise buyers have seen AI research tools overpromise and underdeliver at every price point since 2023. The critical difference between a 100-page AI-generated report and a 100-page McKinsey report is not the length or even necessarily the analytical quality: it is the legal accountability, the reputational skin-in-the-game of named human experts, and the organizational credibility that a top-tier consulting firm's endorsement provides. When a board makes a billion-dollar strategic decision based on a McKinsey recommendation and it goes wrong, there is a named party responsible. When a decision is based on a Marlin session, the accountability structure is fundamentally different, and enterprise legal and governance teams will not overlook that gap easily. The risk is that Marlin finds itself adopted for lower-stakes research tasks while the highest-value consulting engagements remain out of reach.
What to Watch Next
The most important near-term development to track is Marlin's first public case studies from the enterprise beta program. The roughly 300 professionals who participated in the closed beta across financial institutions, operating companies, consulting firms, and think tanks represent a population that knows exactly what good strategy research looks like. If Sakana AI releases even two or three detailed case studies in the next 30 to 60 days showing how Marlin's outputs were used in real decisions, that evidence would do more to validate the product's commercial viability than any benchmark comparison. The alternative, a launch followed by silence on customer outcomes, would signal that the beta results were not compelling enough to be made public.
In the 90-day window, watch for whether any of the major AI frontier labs respond with their own longer-session enterprise research products. OpenAI and Anthropic both have the technical capability to build something similar to Marlin; the question is whether Sakana's product demonstrates sufficient market demand to make it a priority. Anthropic's enterprise strategy has been focused on Claude's coding and analysis capabilities rather than autonomous multi-hour research sessions, but a commercially successful Marlin could accelerate Anthropic's roadmap in this direction. OpenAI, with its broader enterprise footprint and partnership with Salesforce, has more distribution to leverage if it builds a competing product. The first 90 days will reveal whether Marlin has opened a new product category or a niche.
Looking six months out, the critical question is whether Marlin's accuracy on real enterprise strategy questions is high enough to withstand the scrutiny that comes with high-stakes corporate decisions. Consumer deep-research tools have a forgiving error margin because the stakes of a wrong answer are low. An 8-hour Marlin session that produces a flawed market sizing estimate or a misidentified competitive threat, delivered to a CFO as a board-ready document, carries consequences that a consumer product never faces. Sakana AI will need to demonstrate not just that Marlin produces impressive-looking outputs but that those outputs are reliably correct at the level of precision that enterprise decisions require. That quality bar is far higher than any consumer AI product has yet been held to, and it will determine whether Marlin's commercial launch becomes a category-defining moment or a cautionary tale about AI products that overpromised in the enterprise.
Marlin doesn't promise to make your strategy team faster. It promises to make the question of whether you need a strategy team worth asking again.
Key Takeaways
- Sakana AI launched Marlin commercially on June 15, 2026, making it the first enterprise-grade autonomous research agent to run unattended sessions of up to eight hours
- AB-MCTS (Adaptive Branching Monte Carlo Tree Search), a member of the tree-search family that AlphaGo relied on, drives Marlin's research reasoning and was recognized with a Spotlight at NeurIPS 2025
- Approximately 300 professionals from financial institutions, operating companies, consulting firms, and think tanks participated in the closed beta that ran from April through early June 2026
- Output: 100+ page reports and executive slide decks, produced autonomously without human intervention following a single initial prompt
- Sakana AI is backed by $379 million in funding at a $2.65 billion valuation, co-founded by former Google Brain researchers including Transformer paper co-author Llion Jones
Questions Worth Asking
- If an 8-hour Marlin session produces a strategy recommendation that a company acts on and the decision fails, who carries the accountability, and does the absence of a named human expert fundamentally limit Marlin's role in high-stakes board decisions?
- The consulting industry's real value may not be the research itself but the change management and implementation relationships that surround it. If Marlin automates research but not delivery, does it disrupt the industry or merely accelerate the trend of consultants doing less research and more change management?
- AB-MCTS explores a research tree by branching and pruning based on assessed value. What happens when the information landscape itself contains systematic biases or misinformation that lead Marlin to prune the most important branches in a complex geopolitical or macroeconomic analysis?