Tag: Open Source

  • Hermes Agent: The Self-Improving AI Agent That Learns From Every Conversation

    Artificial intelligence agents are everywhere these days, but most of them share a fundamental limitation: they don’t really learn from their experiences. You have the same conversation with them repeatedly, and they never get better. Nous Research aims to change that with Hermes Agent, a new open-source project that bills itself as “the agent that grows with you.”

    A Memory That Actually Remembers

    Traditional AI assistants treat every conversation as a clean slate. Hermes takes a fundamentally different approach. It maintains persistent memory across sessions, creating skills from experience and improving them during use. The agent nudges itself to retain knowledge, searches through past conversations, and builds a deepening model of who you are over time.

    The project describes itself as “the only agent with a built-in learning loop,” and that loop goes beyond simple context windows. While conventional agents can only work with what you tell them in the current session, Hermes actively works to preserve and apply knowledge from previous interactions. That customer you mentioned last week? Hermes remembers. That preference you expressed months ago? It’s still there.

    Works Everywhere You Do

    One of Hermes’s standout features is its multi-platform support. You can interact with it through Telegram, Discord, Slack, WhatsApp, Signal, or traditional CLI — all from a single gateway process. Voice memo transcription and cross-platform conversation continuity mean you can start a conversation on your phone and continue it on your desktop without missing a beat.

    The agent runs on a VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. With Daytona and Modal, the agent’s environment hibernates when idle and wakes on demand. This means you get persistent assistance without persistent costs.

    Model Flexibility Without Lock-In

    Hermes doesn’t force you into a single AI provider. You can use Nous Portal, OpenRouter (with access to 200+ models), z.ai/GLM, Kimi/Moonshot, MiniMax, OpenAI, or your own endpoint. Switching models is as simple as running the model command — no code changes, no lock-in.

    This flexibility is particularly valuable for developers who want to experiment with different models for different tasks, or organizations that need to balance cost and performance across use cases.
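    Under the hood, most of these providers expose OpenAI-compatible chat endpoints, so switching typically amounts to pointing the client at a different base URL and model name. The sketch below is a hypothetical illustration of that pattern, not Hermes’s actual implementation; the model names are placeholders, and no request is actually sent.

```python
# Hypothetical sketch: provider switching against OpenAI-compatible endpoints.
# Endpoint URLs follow each provider's documented convention; the model
# names are illustrative placeholders, not Hermes's real configuration.
PROVIDERS = {
    "openrouter": {"base_url": "https://openrouter.ai/api/v1", "model": "example/model"},
    "ollama":     {"base_url": "http://localhost:11434/v1",    "model": "example-local"},
}

def build_chat_request(provider: str, prompt: str) -> dict:
    """Assemble the request an OpenAI-compatible client would send."""
    cfg = PROVIDERS[provider]
    return {
        "url": f"{cfg['base_url']}/chat/completions",
        "json": {
            "model": cfg["model"],
            "messages": [{"role": "user", "content": prompt}],
        },
    }
```

    Because the wire format is identical across providers, swapping models really is a configuration change rather than a code change.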

    The Skills System

    Hermes includes a sophisticated skills system that allows the agent to create procedural memories and improve them autonomously. After completing complex tasks, the agent can create new skills that encapsulate what it learned. These skills then self-improve during subsequent use.

    The system uses FTS5 session search with LLM summarization for cross-session recall, and is compatible with the agentskills.io open standard. There’s also a Skills Hub where users can share and discover community-created skills.
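    To make the FTS5 mechanism concrete, here is a minimal sketch of cross-session recall using Python’s built-in sqlite3 module. The schema and example rows are assumptions for illustration; Hermes’s actual storage layout is not documented here.

```python
import sqlite3

# Minimal sketch of FTS5-based cross-session recall (illustrative schema,
# not Hermes's actual storage layout).
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE turns USING fts5(session_id, content)")
db.executemany(
    "INSERT INTO turns VALUES (?, ?)",
    [
        ("session-1", "Customer Acme Corp asked about renewal pricing"),
        ("session-2", "User prefers weekly summaries as bullet points"),
    ],
)

def recall(query: str):
    """Return past turns ranked by FTS5 relevance to the query."""
    return db.execute(
        "SELECT session_id, content FROM turns WHERE turns MATCH ? ORDER BY rank",
        (query,),
    ).fetchall()
```

    In a full learning loop, the top hits would then be passed to an LLM for summarization before being injected into the new session’s context.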

    Research-Ready Architecture

    For AI researchers, Hermes offers batch trajectory generation, Atropos RL environments, and trajectory compression for training the next generation of tool-calling models. The project was built by Nous Research, the team behind several notable open-source AI projects.

    The installation process is straightforward — run a single curl command and you’re chatting with your new AI assistant in minutes. Windows users need WSL2, but Linux and macOS are supported natively.

    Migration from OpenClaw

    Interesting twist: Hermes can automatically import settings from OpenClaw, including persona files, memories, skills, API keys, and messaging configurations. If you’re already running an AI assistant setup, moving to Hermes is designed to be painless.

    With over 12,000 stars on GitHub, Hermes represents an interesting evolution in the AI agent space. Instead of just providing a static set of capabilities, it attempts to create a genuinely learning system — one that gets better at helping you specifically, over time.

    The MIT-licensed project welcomes contributions and has an active Discord community for support and discussion. Whether you’re an individual looking for a more personal AI assistant or an enterprise exploring agentic workflows, Hermes offers a compelling combination of memory, flexibility, and self-improvement that sets it apart in a crowded field.

  • DeerFlow 2.0: ByteDance’s Open-Source SuperAgent Framework Takes GitHub by Storm

    ByteDance, the Chinese tech giant best known for TikTok, has released what may be one of the most ambitious open-source AI agent frameworks to date: DeerFlow 2.0. Since its launch, the project has accumulated over 42,000 stars on GitHub, with more than 4,300 stars earned in a single day — a growth trajectory that has the entire machine learning community buzzing.

    DeerFlow 2.0 is described as an “open-source SuperAgent harness.” But what does that actually mean? In practical terms, it’s a framework that orchestrates multiple AI sub-agents working together in sandboxes to autonomously complete complex, multi-hour tasks — from deep research reports to functional web pages to AI-generated videos.

    From Deep Research to Full-Stack Super Agent

    The original DeerFlow launched in May 2025 as a focused deep-research framework. Version 2.0 is a ground-up rewrite on LangGraph 1.0 and LangChain that shares no code with its predecessor. ByteDance explicitly framed the release as a transition “from a Deep Research agent into a full-stack Super Agent.”

    The key architectural difference is that DeerFlow is not just a thin wrapper around a large language model. While many AI tools give a model access to a search API and call it an agent, DeerFlow 2.0 gives its agents an actual isolated computer environment: a Docker sandbox with a persistent, mountable filesystem.

    The system maintains both short- and long-term memory that builds user profiles across sessions. It loads modular “skills” — discrete workflows — on demand to keep context windows manageable. And when a task is too large for one agent, a lead agent decomposes it, spawns parallel sub-agents with isolated contexts, executes code and bash commands safely, and synthesizes the results into a finished deliverable.
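    The decompose/spawn/synthesize loop described above can be sketched in a few lines. This is a toy illustration of the orchestration pattern only, with plain functions standing in for the LLM-driven, sandboxed sub-agents DeerFlow actually runs.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy sketch of the lead-agent pattern: decompose a task, run sub-tasks
# in parallel with isolated contexts, merge the results. In DeerFlow the
# sub-agents are LLM-driven and sandboxed; here they are stand-ins.

def decompose(task: str) -> list:
    return [f"{task}: research", f"{task}: draft", f"{task}: review"]

def run_subagent(subtask: str) -> str:
    # Each sub-agent receives only its own subtask (isolated context).
    return f"result({subtask})"

def synthesize(results: list) -> str:
    return " | ".join(results)

def lead_agent(task: str) -> str:
    subtasks = decompose(task)
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(run_subagent, subtasks))
    return synthesize(results)
```

    Because each sub-agent sees only its own slice of the task, context stays small and isolated; the lead agent is the only component holding the full picture.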

    Key Features That Set DeerFlow 2.0 Apart

    DeerFlow 2.0 ships with a remarkable set of capabilities:

    • Docker-based AIO Sandbox: Every agent runs inside an isolated container with its own browser, shell, and persistent filesystem. This ensures that the agent’s operations remain strictly contained, even when executing bash commands or manipulating files.
    • Model-Agnostic Design: The framework works with any OpenAI-compatible API. While many users opt for cloud-based inference via OpenAI or Anthropic APIs, DeerFlow supports fully localized setups through Ollama, making it ideal for organizations with strict data sovereignty requirements.
    • Progressive Skill Loading: Modular skills are loaded on demand to keep context windows manageable, allowing the system to handle long-horizon tasks without performance degradation.
    • Kubernetes Support: For enterprise deployments, DeerFlow supports distributed execution across a private Kubernetes cluster.
    • IM Channel Integration: The framework can connect to external messaging platforms like Slack or Telegram without requiring a public IP.
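    Of these, progressive skill loading is the easiest to picture in code. The sketch below assumes a hypothetical skill registry: short descriptions stay resident at all times, while a skill’s full workflow text enters the context only when the task appears to need it.

```python
# Illustrative sketch of progressive skill loading (hypothetical registry,
# not DeerFlow's actual data structures): keep a one-line index of every
# skill resident, and load full instructions into context on demand.

SKILL_INDEX = {
    "web_research": "Search and summarize online sources",
    "data_analysis": "Run tabular analysis over CSV files",
}

SKILL_BODIES = {
    "web_research": "Step 1: formulate queries... (long workflow text)",
    "data_analysis": "Step 1: load the file... (long workflow text)",
}

def build_context(task: str) -> str:
    """Include full instructions only for skills the task mentions."""
    loaded = [
        SKILL_BODIES[name]
        for name in SKILL_INDEX
        if name.split("_")[0] in task.lower()
    ]
    index = "\n".join(f"- {n}: {d}" for n, d in SKILL_INDEX.items())
    return f"Available skills:\n{index}\n\nLoaded:\n" + "\n".join(loaded)
```

    The context window then carries a few hundred tokens of index plus only the workflows in active use, rather than every skill all the time.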

    Real-World Capabilities

    Demos on the project’s official website (deerflow.tech) showcase real outputs: agent-generated trend forecast reports, videos generated from literary prompts, comics explaining machine learning concepts, data analysis notebooks, and podcast summaries. The framework is designed for tasks that take minutes to hours to complete — the kind of work that currently requires a human analyst or a paid subscription to a specialized AI service.

    ByteDance specifically recommends using Doubao-Seed-2.0-Code, DeepSeek v3.2, and Kimi 2.5 to run DeerFlow, though the model-agnostic design means enterprises aren’t locked into any particular provider.

    Enterprise Readiness and the Safety Question

    One of the most pressing questions for enterprise adoption is safety and readiness. While the MIT license is enterprise-friendly, organizations need to evaluate whether DeerFlow 2.0 is production-ready for their specific use cases. The Docker sandbox provides functional isolation, but organizations with strict compliance requirements should carefully evaluate the deployment architecture.

    ByteDance offers a bifurcated deployment strategy: the core harness can run directly on a local machine, across a private Kubernetes cluster, or connect to external messaging platforms — all without requiring a public IP. This flexibility allows organizations to tailor the system to their specific security posture.

    The Open Source AI Agent Race

    DeerFlow 2.0 enters an increasingly crowded field. Its approach of combining sandboxed execution, memory management, and multi-agent orchestration is similar to what NanoClaw (an OpenClaw variant) is pursuing with its Docker-based enterprise sandbox offering. But DeerFlow’s permissive MIT license and the backing of a major tech company give it a unique position in the market.

    The framework’s rapid adoption — over 39,000 stars within a month of launch and 4,600 forks — signals strong community interest in production-grade open-source agent frameworks. For developers and enterprises looking to build sophisticated AI workflows without vendor lock-in, DeerFlow 2.0 is definitely worth watching.

    The project is available now on GitHub under the MIT License.

  • Nvidia’s Nemotron-Cascade 2: How a 3B Parameter Model Wins Gold Medals in Math and Coding

    The prevailing assumption in AI development has been straightforward: larger models trained on more data produce better results. Nvidia’s latest release directly challenges that orthodoxy — and the training recipe behind it may matter more to enterprise AI teams than the model itself.

    Nemotron-Cascade 2 is an open-weight 30B Mixture-of-Experts model that activates only 3B parameters at inference time. Despite this compact footprint, it achieved gold medal-level performance on three of the world’s most demanding competitions: the 2025 International Mathematical Olympiad, the International Olympiad in Informatics, and the ICPC World Finals. It is only the second open model to reach this tier, after DeepSeek-V3.2-Speciale — a model with 20 times more parameters.

    Nvidia Nemotron-Cascade 2 Performance

    The Post-Training Revolution

    Pre-training a large language model from scratch is enormously expensive — on the order of tens to possibly hundreds of millions of dollars for frontier models. Nemotron-Cascade 2 starts from the same base model as Nvidia’s existing Nemotron-3-Nano — yet it outperforms that model on nearly every benchmark, often surpassing Nvidia’s own Nemotron-3-Super, a model with four times the active parameters.

    The difference is entirely in the post-training recipe. This is the strategic insight for enterprise teams: you don’t necessarily need a bigger or more expensive base model. You may need a better training pipeline on top of the one you already have.

    Cascade RL: Sequential Domain Training

    Reinforcement learning has become the dominant technique for teaching LLMs to reason. The challenge is that training a model on multiple domains simultaneously — math, code, instruction-following, agentic tasks — often causes interference. Improving performance in one domain degrades it in another, a phenomenon known as catastrophic forgetting.

    Cascade RL addresses this by training RL stages sequentially, one domain at a time, rather than mixing everything together. Nemotron-Cascade 2 follows a specific ordering: first instruction-following RL, then multi-domain RL, then on-policy distillation, then RLHF for human preference alignment, then long-context RL, then code RL, and finally software engineering RL.
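    As a toy illustration of the cascade idea (not Nvidia’s actual training code), each stage can be modeled as a function that consumes the previous stage’s checkpoint, so domains never compete within a single mixed run.

```python
# Toy sketch of sequential (cascade) post-training: each stage starts
# from the previous stage's output checkpoint. "train_stage" is a
# stand-in for a real RL loop; checkpoints here just record provenance.

STAGES = [
    "instruction_following_rl",
    "multi_domain_rl",
    "on_policy_distillation",
    "rlhf",
    "long_context_rl",
    "code_rl",
    "software_engineering_rl",
]

def train_stage(checkpoint: dict, stage: str) -> dict:
    """Stand-in for one RL stage; appends the stage to the history."""
    return {**checkpoint, "history": checkpoint["history"] + [stage]}

def cascade(base: dict):
    """Run stages in order, keeping every stage's output checkpoint."""
    ckpt, per_stage = base, {}
    for stage in STAGES:
        ckpt = train_stage(ckpt, stage)
        per_stage[stage] = ckpt  # kept so a later step can reuse it as a teacher
    return ckpt, per_stage
```

    Keeping every intermediate checkpoint is deliberate: the best checkpoint per domain becomes reusable later, which is exactly what the distillation step below exploits.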

    MOPD: Reusing Your Own Training Checkpoints

    Even with careful sequential ordering, some performance drift is inevitable as the model passes through many RL stages. Nvidia’s solution is Multi-Domain On-Policy Distillation — a technique that selects the best intermediate checkpoint for each domain and uses it as a “teacher” to distill knowledge back into the student model.

    Critically, these teachers come from the same training run, sharing the same tokenizer and architecture. This eliminates distribution mismatch problems that arise when distilling from a completely different model family. According to Nvidia’s technical report, MOPD recovered teacher-level performance within 30 optimization steps on the AIME 2025 math benchmark, while standard GRPO required more steps to achieve a lower score.
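    At its core, distilling from a same-run teacher checkpoint means pushing the student’s next-token distribution toward the teacher’s, typically via a KL-divergence loss. A toy, hand-made example (not Nvidia’s implementation; the distributions are invented for illustration):

```python
import math

# Toy sketch of the distillation objective: the "teacher" is the best
# intermediate checkpoint for a domain, and the student is trained to
# match its next-token distribution. Probabilities below are invented.

def kl_divergence(teacher, student) -> float:
    """KL(teacher || student) over one next-token distribution."""
    return sum(t * math.log(t / s) for t, s in zip(teacher, student) if t > 0)

teacher_probs = [0.7, 0.2, 0.1]  # best math-domain checkpoint's output
student_probs = [0.5, 0.3, 0.2]  # current student model's output

loss = kl_divergence(teacher_probs, student_probs)  # > 0 until they match
```

    Because teacher and student share the same tokenizer and architecture, these distributions are directly comparable token for token, which is the distribution-mismatch advantage the report highlights.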

    What Enterprise Teams Can Apply

    Several design patterns from this work are directly applicable to enterprise post-training efforts. The sequential domain ordering in Cascade RL means teams can add new capabilities without rebuilding the entire pipeline — a critical property for organizations that need to iterate quickly. MOPD’s approach of using intermediate checkpoints as domain-specific teachers eliminates the need for expensive external teacher models.

    Nemotron-Cascade 2 is part of a broader trend toward “intelligence density” — extracting maximum capability per active parameter. For enterprise deployment, this matters enormously. A model with 3B active parameters can be served at a fraction of the cost and latency of a dense 70B model. Nvidia’s results suggest that post-training techniques can close the performance gap on targeted domains, giving organizations a path to deploy strong reasoning capabilities without frontier-level infrastructure costs.
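    A rough back-of-envelope shows why active parameters dominate serving cost: decode-time compute per generated token scales roughly as twice the active parameter count (a standard approximation that ignores attention overhead and memory bandwidth).

```python
# Back-of-envelope decode compute. The 2x-active-params rule is a rough
# standard approximation, not a measured benchmark.

def decode_flops_per_token(active_params: float) -> float:
    return 2.0 * active_params

moe_active = 3e9     # Nemotron-Cascade 2: 3B active (of 30B total)
dense_params = 70e9  # a dense 70B model, for comparison

ratio = decode_flops_per_token(dense_params) / decode_flops_per_token(moe_active)
# roughly 23x fewer FLOPs per generated token for the MoE model
```

    Real-world speedups are smaller than raw FLOP ratios suggest, since serving is often memory-bound, but the direction of the advantage holds.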

    For teams building systems that need deep reasoning on structured problems — financial modeling, scientific computing, software engineering, compliance analysis — Nvidia’s technical report offers one of the more detailed post-training methodologies published to date. The model and its training recipe are now available for download, giving enterprise AI teams a concrete foundation for building domain-specific reasoning systems without starting from scratch.

  • DeerFlow 2.0: ByteDance’s Open-Source SuperAgent That Could Redefine Enterprise AI

    The AI agent landscape shifted dramatically this week with the viral explosion of DeerFlow 2.0, ByteDance’s ambitious open-source framework that transforms language models into fully autonomous “SuperAgents” capable of handling complex, multi-hour tasks from deep research to code generation. With over 39,000 GitHub stars and 4,600 forks in just weeks, this MIT-licensed framework is being hailed by developers as a paradigm shift in AI agent architecture.

    What Makes DeerFlow 2.0 Different

    Unlike typical AI tools that merely wrap a language model with a search API, DeerFlow 2.0 provides agents with their own isolated Docker-based computer environment — a complete sandbox with filesystem access, persistent storage, and a dedicated shell and browser. This “computer-in-a-box” approach means agents can execute bash commands, manipulate files, run code, and perform data analysis without risking damage to the host system.

    DeerFlow GitHub Repository

    The framework maintains both short-term and long-term memory that builds comprehensive user profiles across sessions. It loads modular “skills” — discrete workflows — on demand to keep context windows manageable. When a task proves too large for a single agent, the lead agent decomposes it, spawns parallel sub-agents with isolated contexts, executes code safely, and synthesizes results into polished deliverables.

    From Deep Research to Full-Stack Super Agent

    DeerFlow’s original v1 launched in May 2025 as a focused deep-research framework. Version 2.0 represents a ground-up rewrite built on LangGraph 1.0 and LangChain, sharing no code with its predecessor. ByteDance explicitly framed the release as a transition “from a Deep Research agent into a full-stack Super Agent.”

    DeerFlow Architecture Overview

    New capabilities include a batteries-included runtime with filesystem access, sandboxed execution, persistent memory, and sub-agent spawning; progressive skill loading; Kubernetes support for distributed execution; and long-horizon task management that runs autonomously across extended timeframes.

    The framework is fully model-agnostic, working with any OpenAI-compatible API. It has strong out-of-the-box support for ByteDance’s own Doubao-Seed models, DeepSeek v3.2, Kimi 2.5, Anthropic’s Claude, OpenAI’s GPT variants, and local models run via Ollama. It also integrates with Claude Code for terminal-based tasks and connects to messaging platforms including Slack, Telegram, and Feishu.

    Why It’s Going Viral

    The project’s current viral moment results from a slow build that accelerated sharply after deeplearning.ai’s The Batch covered it, followed by influential posts on social media. After intensive personal testing, AI commentator Brian Roemmele declared that “DeerFlow 2.0 absolutely smokes anything we’ve ever put through its paces” and called it a “paradigm shift,” adding that his company had dropped competing frameworks entirely in favor of running DeerFlow locally.

    One widely-shared post framed the business implications bluntly: “MIT licensed AI employees are the death knell for every agent startup trying to sell seat-based subscriptions. The West is arguing over pricing while China just commoditized the entire workforce.”

    The ByteDance Question

    ByteDance’s involvement introduces complexity. The MIT-licensed, fully auditable code allows developers to inspect exactly what it does, where data flows, and what it sends to external services — materially different from using a closed ByteDance consumer product. However, ByteDance operates under Chinese law, and for organizations in regulated industries like finance, healthcare, and defense, the provenance of software tooling triggers formal review requirements regardless of the code’s quality or openness.

    Strategic Implications for Enterprises

    The deeper significance of DeerFlow 2.0 may be less about the tool itself and more about what it represents: the race to define autonomous AI infrastructure and turn language models into something more like full employees capable of both communications and reliable actions.

    The MIT License positions DeerFlow 2.0 as a royalty-free alternative to proprietary agent platforms, potentially functioning as a cost ceiling for the entire category. Enterprises should favor adoption if they prioritize data sovereignty and auditability, as the framework supports fully local execution with models like DeepSeek or Kimi.

    As AI agents evolve from novelty demonstrations to production infrastructure, DeerFlow 2.0 represents a significant open-source contribution that enterprises can evaluate on technical merit — provided they also consider the broader geopolitical context that now accompanies any software decision involving Chinese-origin technology.

  • Project N.O.M.A.D: The Offline Survival AI Computer That Works Without Internet

    When disaster strikes and the internet goes dark, most AI tools become useless. Project N.O.M.A.D is here to change that.

    Project N.O.M.A.D (Nomadic Offline Machine for Autonomous Defense and Discovery) is an open-source, self-contained offline survival computer that packs critical tools, knowledge, and AI capabilities into a single portable device — one that works entirely without internet connectivity.

    Built with TypeScript and hosted on GitHub at Crosstalk-Solutions/project-nomad, the project has already garnered over 14,200 stars with an extraordinary 4,100+ stars in a single day — a sign of genuine viral demand that reflects real-world need.

    What Is Project N.O.M.A.D?

    Unlike typical web-based AI applications, Project N.O.M.A.D runs entirely on local hardware. It requires zero network connection to function, making it uniquely valuable in emergency scenarios. The project combines several survival-critical capabilities:

    • Local AI inference engine — offline question answering using pre-downloaded models
    • Pre-loaded knowledge databases covering first aid, navigation, weather prediction, and wilderness survival
    • Communication tools that work over radio frequencies or mesh networks independent of cellular infrastructure
    • Resource management modules for tracking food, water, supplies, and medical inventory
    • Emergency signal beacons and GPS-independent navigation for disoriented users
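    The hard engineering question behind the first two items is answering questions with no network at all. A minimal sketch of the idea: score pre-loaded knowledge entries by keyword overlap with the question. The entries and scoring are illustrative; a real build would use a local language model and a far larger corpus.

```python
# Minimal sketch of fully offline retrieval: rank pre-loaded knowledge
# entries by keyword overlap with the question. No network calls. The
# entries below are illustrative, not N.O.M.A.D's actual data.

KNOWLEDGE_BASE = [
    ("first_aid", "treat a burn: cool with clean water, cover loosely"),
    ("navigation", "find north at night using the Pole Star"),
    ("water", "purify water by boiling for at least one minute"),
]

def answer_offline(question: str) -> str:
    """Return the best-matching local entry, or a fallback message."""
    q_words = set(question.lower().split())
    scored = [
        (len(q_words & set(text.lower().split())), text)
        for _, text in KNOWLEDGE_BASE
    ]
    best_score, best_text = max(scored)
    return best_text if best_score > 0 else "no match in local knowledge base"
```

    Everything the function touches ships on the device, which is exactly the property that makes the tool useful in a grid-down scenario.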

    Why It Matters

    Traditional AI assistants like ChatGPT or Claude require an active internet connection. In emergency scenarios — natural disasters, wilderness survival situations, remote fieldwork, or grid-down events — this dependency becomes life-threatening. Project N.O.M.A.D eliminates that single point of failure entirely.

    Notably, the project credits AI-assisted workflows in its own development (including what appears to be Claude-assisted contributions), suggesting it was designed with AI-native development principles from the ground up.

    Technical Highlights

    The system is built with TypeScript, making it accessible to a wide range of developers. Key technical features include:

    • Modular skill packs — users can add capabilities based on specific mission requirements
    • Cross-platform compatibility — runs on laptops, Raspberry Pi clusters, or dedicated survival hardware
    • Extensible knowledge graphs — users can customize for their specific geographic or operational context

    The GitHub repository’s rapid star growth (4,138 stars today alone) reflects a genuine appetite for AI that does not betray you when you need it most. In an era of increasing climate-related disasters and growing interest in self-sufficiency, Project N.O.M.A.D represents a compelling intersection of open-source software and practical survivalism.

    The Bigger Picture

    This project signals a broader trend: AI systems designed for degraded or absent infrastructure. While most of the AI industry chases cloud-based performance metrics, a counter-movement is building AI tools that prioritize resilience over raw capability.

    For developers, Project N.O.M.A.D offers an interesting architecture to study — how do you build an AI pipeline that delivers meaningful results with no external API calls, no cloud retrieval, and no streaming responses? The answers this project develops could influence edge AI deployment for years to come.

    Get involved: The project is fully open source and welcomes contributors. Whether you are interested in expanding its knowledge base, improving its offline models, or building dedicated hardware enclosures, the GitHub repository is the place to start.

    Project N.O.M.A.D on GitHub trending