SakanaAI’s AI-Scientist-v2: When Machines Start Doing Scientific Research

The landscape of scientific research is undergoing a profound transformation. SakanaAI, a Tokyo-based AI research company, has unveiled AI-Scientist-v2, a groundbreaking system capable of autonomously conducting scientific research – from generating hypotheses to running experiments, analyzing data, and producing publishable scientific manuscripts. Most remarkably, this system has already produced the first workshop paper written entirely by AI that was accepted through peer review.

The Evolution from AI Scientist-v1 to v2

The journey from AI Scientist-v1 to v2 represents a significant leap in autonomous research capabilities. While v1 relied heavily on human-authored templates and worked best for tasks with clear objectives and solid foundations, v2 drops the dependence on templates altogether.

AI Scientist-v2 generalizes across machine learning domains without requiring template structures. It employs a progressive agentic tree search guided by an experiment manager agent, allowing it to explore research directions more freely and uncover insights that template-based systems might miss.

“The key innovation is the agentic tree search approach,” explains the SakanaAI team. “Rather than following a predetermined path, the system can dynamically explore multiple hypotheses, learn from experimental results, and redirect its investigation based on what it discovers.”
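To make the idea concrete, here is a minimal sketch of a best-first tree search over experiment attempts. It is an illustration under stated assumptions, not SakanaAI's implementation: the names Node, tree_search, toy_propose and toy_evaluate are hypothetical, and the toy functions stand in for the LLM calls and real experiment runs that the actual system would use.

```python
import heapq
import itertools
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """One experiment attempt in the search tree (hypothetical structure)."""
    plan: str
    score: float = 0.0            # higher is better, set after evaluation
    failed: bool = False          # True if the experiment crashed
    parent: Optional["Node"] = None
    depth: int = 0


def tree_search(
    root_plan: str,
    propose: Callable[[Node, int], List[str]],
    evaluate: Callable[[str], Tuple[float, bool]],
    budget: int = 10,
    branching: int = 2,
) -> Node:
    """Expand the most promising node first until the evaluation budget is spent."""
    counter = itertools.count()   # tie-breaker so the heap never compares Nodes
    root = Node(plan=root_plan)
    root.score, root.failed = evaluate(root.plan)
    frontier = [(-root.score, next(counter), root)]
    best, evaluated = root, 1

    while frontier and evaluated < budget:
        _, _, node = heapq.heappop(frontier)
        for child_plan in propose(node, branching):
            if evaluated >= budget:
                break
            child = Node(plan=child_plan, parent=node, depth=node.depth + 1)
            child.score, child.failed = evaluate(child.plan)
            evaluated += 1
            # Prefer any working node over a failed one, then the higher score.
            if not child.failed and (best.failed or child.score > best.score):
                best = child
            heapq.heappush(frontier, (-child.score, next(counter), child))
    return best


def toy_propose(node: Node, k: int) -> List[str]:
    """Stand-in for an LLM: debug a failed plan, otherwise propose k refinements."""
    if node.failed:
        return [node.plan.replace("!bug", "")]
    return [f"{node.plan}+tweak{i}" for i in range(k)]


def toy_evaluate(plan: str) -> Tuple[float, bool]:
    """Stand-in for running an experiment: returns (score, failed)."""
    if "!bug" in plan:
        return 0.0, True
    return float(plan.count("+tweak")), False


best = tree_search("baseline!bug", toy_propose, toy_evaluate, budget=6, branching=2)
```

In this toy run the root plan fails, the search first "debugs" it, and then keeps refining whichever working branch scores highest. The budget caps the total number of experiments, which is the knob that trades exploration against cost.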

How the System Works

AI Scientist-v2 operates through a four-stage pipeline:

Research Ideation: The system generates potential research ideas by analyzing existing literature and identifying gaps or opportunities for novel contributions. It queries the Semantic Scholar API to check each idea for novelty, reducing the chance that a generated hypothesis has already been explored.

Experimental Exploration: Once a hypothesis is selected, the system designs and runs experiments autonomously. Using agentic tree search, it explores different approaches, debugs failing ones, and progressively refines its methodology.

Data Analysis: After experiments complete, the system analyzes results using statistical methods appropriate to the research domain.

Paper Writing: Finally, it synthesizes findings into a coherent scientific manuscript, including related work comparisons, methodology descriptions, results presentation, and discussion of implications.
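The novelty check in the ideation stage can be sketched roughly as follows. The URL is the public Semantic Scholar paper search endpoint, and the response is assumed to carry its results in a "data" list of papers with a "title" field. Everything else is an assumption made for illustration: the article does not say how search results are compared against an idea, so the token-overlap (Jaccard) score, the 0.6 threshold, and the function names below are hypothetical stand-ins rather than the system's actual logic.

```python
import re
from typing import Dict, Optional, Set, Tuple
from urllib.parse import urlencode

# Public Semantic Scholar Graph API paper search endpoint.
SEARCH_URL = "https://api.semanticscholar.org/graph/v1/paper/search"


def build_search_url(idea_title: str, limit: int = 10) -> str:
    """Build the search request URL for an idea title (no network call here)."""
    params = {"query": idea_title, "limit": limit, "fields": "title,abstract"}
    return f"{SEARCH_URL}?{urlencode(params)}"


def tokens(text: str) -> Set[str]:
    """Lowercased alphanumeric word set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two titles, from 0.0 to 1.0."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def closest_match(idea_title: str, response: Dict) -> Tuple[Optional[str], float]:
    """Return the most similar existing title and its similarity score."""
    best_title, best_sim = None, 0.0
    for paper in response.get("data") or []:
        title = paper.get("title") or ""
        sim = jaccard(idea_title, title)
        if sim > best_sim:
            best_title, best_sim = title, sim
    return best_title, best_sim


def looks_novel(idea_title: str, response: Dict, threshold: float = 0.6) -> bool:
    """Hypothetical rule: novel if no existing title overlaps too strongly."""
    _, sim = closest_match(idea_title, response)
    return sim < threshold


# Made-up response in the assumed shape, used instead of a live API call.
fake_response = {
    "data": [
        {"title": "Graph neural networks for molecules"},
        {"title": "Dropout Schedules for Small Transformers"},
    ]
}
is_novel = looks_novel("Dropout schedules for small transformers", fake_response)
```

Keeping the request building separate from the scoring means the comparison rule can be swapped out, for example for an LLM judgment over the returned abstracts, without touching the search code.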

Technical Implementation

The system is designed to run on Linux systems with NVIDIA GPUs using CUDA and PyTorch. It supports multiple LLM backends including OpenAI models, Google Gemini through OpenAI API compatibility, and Claude models via Amazon Bedrock.

The default configuration uses Claude 3.5 Sonnet for the experimentation phase, which the team reports typically costs up to around $20 per complete run. The subsequent writing phase adds a further cost when using the default models.

A typical full experiment run – covering ideation, experimentation, analysis, and paper generation – completes within several hours, though complex ideas may take longer.

Real-World Results and Acceptance

The proof of the system’s capability came when AI Scientist-v2 produced a complete workshop paper that was accepted through standard peer review at an ICLR 2025 workshop. This marked the first time an entirely AI-generated scientific manuscript passed human scholarly evaluation.

However, the team is transparent about limitations: “The AI Scientist-v2 doesn’t necessarily produce better papers than v1, especially when a strong starting template is available. v1 follows well-defined templates, leading to high success rates, while v2 takes a broader, more exploratory approach with lower success rates.”

Safety Considerations and Ethical Use

The SakanaAI team emphasizes the experimental nature of the system and includes strong warnings about potential risks. Since the codebase executes LLM-written code, running it within controlled sandbox environments – preferably Docker containers – is mandatory.
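As a rough illustration of what such sandboxing might look like, the sketch below builds a docker run command for a single generated experiment script. The flags used (--rm, --network, --memory, --pids-limit, --gpus, -v, -w) are standard Docker options, but the image name, resource limits, paths, and the sandbox_command helper are assumptions made for this example and are not taken from the project. Disabling the network only suits steps that do not need to call an LLM API.

```python
import os
from typing import List


def sandbox_command(
    workdir: str,
    script: str,
    image: str = "my-ai-scientist-image",   # hypothetical image name
    use_gpu: bool = True,
) -> List[str]:
    """Build (but do not run) a docker command for one generated script."""
    cmd = [
        "docker", "run", "--rm",             # remove the container afterwards
        "--network", "none",                 # no network access from inside
        "--memory", "16g",                   # assumed memory cap
        "--pids-limit", "512",               # assumed cap on process count
        "-v", f"{os.path.abspath(workdir)}:/workspace",  # only this dir is shared
        "-w", "/workspace",
    ]
    if use_gpu:
        cmd += ["--gpus", "all"]             # expose NVIDIA GPUs to the container
    cmd += [image, "python", script]
    return cmd


command = sandbox_command("/tmp/run1", "experiment.py")
```

The resulting list can be handed to subprocess.run(command, check=True, timeout=...) so that a runaway experiment is also bounded in time. Building the command as a list rather than a shell string avoids quoting problems when LLM-generated file names are involved.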

The system requires mandatory disclosure in any resulting publications: “This manuscript was autonomously generated using The AI Scientist.” This reflects the team’s commitment to transparency in AI-assisted or AI-generated research.

Implications for the Future of Science

AI Scientist-v2 represents a glimpse into a future where AI accelerates scientific discovery significantly. By automating routine aspects of research, human scientists could focus more on creative synthesis, ethical oversight, and guiding high-level research directions.

The system has particular relevance for accelerating research in machine learning itself – a meta-scientific capability that could lead to self-improving research systems. Early experiments suggest the approach can generate valid scientific insights that contribute meaningfully to the research community.

Looking Forward

SakanaAI continues to develop and refine the system, with plans to expand capabilities and improve success rates across more research domains. The team has made the code publicly available on GitHub for researchers interested in exploring automated scientific discovery.

As AI systems like AI Scientist-v2 mature, they promise to reshape not just how research is conducted, but potentially what kinds of scientific questions we can ask and answer. The era of AI as a genuine research partner – not just a tool – may be closer than we think.