
Tutorials

DeerFlow 2.0: ByteDance Open-Sourced a Full-Stack SuperAgent. Here's the Complete Developer Guide.

ByteDance's DeerFlow 2.0 hit #1 on GitHub Trending with 39K stars in weeks. It's not another chatbot wrapper — it's a full-stack SuperAgent harness with sandboxed execution, persistent memory, sub-agents, and LangGraph orchestration. Here's everything you need to build with it.

AIStackInsights Team · March 28, 2026 · 13 min read

ai-agents · open-source · langgraph · tutorials · developer-tools · llms

On February 28, 2026, a repository from ByteDance — the company behind TikTok — claimed the #1 spot on GitHub Trending. Within weeks it had accumulated 39,000 stars and 4,600 forks. The machine learning community called it a paradigm shift. One developer dropped their entire existing agent stack to run it exclusively.

The project is DeerFlow 2.0. And unlike the wave of chatbot wrappers that flood GitHub every week, this one is different in ways that matter to developers who build serious systems.

This is the complete technical guide — architecture, setup, configuration, custom skills, and the hard questions about whether you should actually use it.

📁 Companion scripts for this article — setup automation, custom skill template, and API test harness: github.com/aistackinsights/stackinsights/deerflow-2-superagent-developer-guide

Includes: setup_deerflow.sh, custom_skill_template.py, test_deerflow_api.py


What DeerFlow 2.0 Actually Is

Most AI agent frameworks give a language model access to a search API and call it an agent. DeerFlow 2.0 gives its agents an actual isolated computer — a Docker sandbox with a persistent filesystem, a real browser, and a shell.

The distinction matters enormously in practice.

When an agent using a typical framework needs to write code, execute it, check the output, and iterate — it can't, because there's nowhere to run the code. When a DeerFlow agent needs to do the same, it opens a shell in the sandbox, writes the file, executes it, reads stdout, and adjusts. The entire development loop happens autonomously, in an isolated environment that cannot harm the host system.
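That loop is easy to sketch in plain Python. The snippet below is a stand-in, not DeerFlow code: a temp directory plays the role of the sandbox filesystem, and the "adjustment" is hard-coded where a real agent would reason over stderr.

```python
import subprocess
import tempfile
from pathlib import Path

def run_iteration(workdir: Path, source: str) -> tuple[int, str, str]:
    """Write a script into the (stand-in) sandbox dir, execute it, capture output."""
    script = workdir / "task.py"
    script.write_text(source)
    proc = subprocess.run(
        ["python", str(script)],
        capture_output=True, text=True, timeout=30, cwd=workdir,
    )
    return proc.returncode, proc.stdout, proc.stderr

with tempfile.TemporaryDirectory() as d:
    workdir = Path(d)
    # First attempt: buggy code the "agent" wrote
    code, out, err = run_iteration(workdir, "print(1 / 0)")
    if code != 0:
        # The agent reads stderr and adjusts — here, a hard-coded fix
        code, out, err = run_iteration(workdir, "print('fixed')")
    print(out.strip())  # → fixed
```

The point is the shape of the loop: write, execute, read stdout/stderr, adjust, repeat — all inside an environment the host never has to trust.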

The technical definition: DeerFlow is a SuperAgent harness that orchestrates sub-agents, memory, sandboxes, tools, and skills to complete long-horizon tasks that take minutes to hours — the kind of work that currently requires a human analyst, developer, or a paid subscription to a specialized SaaS product.

What it can do out of the box:

| Task Category | Examples |
| --- | --- |
| Deep research | Industry trend reports, competitive analysis, technical literature reviews |
| Code generation | Functional web pages, scripts, data pipelines — written, executed, and debugged |
| Data analysis | Exploratory analysis with visualizations from raw datasets |
| Content creation | Reports, slide deck outlines, comic strip explainers, video scripts |
| Media processing | Podcast summaries, video transcript analysis |
| Multi-step workflows | Any task requiring web search → code → execute → synthesize |

DeerFlow v1 vs v2: Version 2.0 is a complete ground-up rewrite. It shares zero code with v1. The v1 branch (main-1.x) remains available and maintained for simple deep-research workflows, but all active development is on 2.0. Don't mix them up.


Architecture: The Six Pillars

Understanding DeerFlow's architecture is the fastest path to knowing what you can build on it. There are six core systems:

1. The Orchestration Layer (LangGraph 1.0)

DeerFlow 2.0 was rewritten on LangGraph 1.0, the graph-based orchestration framework from LangChain. Every agent workflow is modeled as a directed graph: nodes are agent steps, edges are conditional transitions based on agent output.

This matters because LangGraph gives you deterministic control over the agent loop. You can inspect state at every step, inject interrupts, resume failed runs, and trace the full decision graph. Compare this to typical "agent loops" where the model decides everything — LangGraph lets the developer define the structure while the model fills in the reasoning.
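To make the idea concrete without pulling in the LangGraph API itself, here is the pattern in plain Python — node names and the routing rule are illustrative, but the shape (nodes as steps, edges as conditional transitions, state inspectable at every hop) is the same:

```python
# Minimal sketch of graph-style orchestration: nodes are functions over
# shared state; a routing function plays the role of conditional edges.

def plan(state):
    state["steps"] = ["research", "write"]
    return state

def research(state):
    state["notes"] = "findings..."
    return state

def write(state):
    state["report"] = f"Report based on: {state['notes']}"
    return state

NODES = {"plan": plan, "research": research, "write": write}

def route(node, state):
    # Conditional edge: inspect current node and state to pick the next node
    if node == "plan":
        return "research"
    if node == "research":
        return "write"
    return None  # terminal node

def run_graph(entry="plan"):
    state, node, trace = {}, entry, []
    while node is not None:
        trace.append(node)            # the full decision path is inspectable
        state = NODES[node](state)
        node = route(node, state)
    return state, trace

state, trace = run_graph()
print(trace)   # ['plan', 'research', 'write']
```

In real LangGraph the routing is declared as conditional edges on a compiled graph, which is what enables interrupts and resuming failed runs mid-path.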

2. The AIO Sandbox

The Agent-In-One (AIO) Sandbox is a Docker container that runs alongside the main DeerFlow stack. It contains:

  • A headless Chromium browser (for web interaction)
  • A full shell environment (bash, Python, Node.js pre-installed)
  • A persistent, mountable filesystem (state survives across tasks)
  • Network isolation (outbound allowed, no inbound to host)
# The sandbox runs as a Docker service — check its status:
docker compose ps sandbox
 
# Exec into it for debugging:
docker compose exec sandbox bash
 
# Files written by agents appear here:
ls ./sandbox-data/

Critically: even if you're running DeerFlow with a local Ollama model — no cloud at all — the sandbox still runs. The agent's actions are always contained. This is the key architectural decision that separates DeerFlow from frameworks that let agents write arbitrary files to your host filesystem.

3. The Skills System

Skills are modular, loadable workflow plugins. Instead of stuffing all capability into one giant system prompt, DeerFlow loads skills on demand — only adding a skill's context to the agent when the task requires it.

# A DeerFlow skill is a LangChain BaseTool subclass
# Skills live in backend/src/skills/
# See companion repo for a full annotated template
 
from langchain_core.tools import BaseTool
from pydantic import BaseModel, Field
 
class MySkillInput(BaseModel):
    query: str = Field(description="The query to process")
    depth: int = Field(default=2, ge=1, le=3)
 
class MySkill(BaseTool):
    name: str = "my_skill"
    description: str = "Use this skill when you need to..."
    args_schema: type[BaseModel] = MySkillInput
 
    def _run(self, query: str, depth: int = 2) -> str:
        # Your logic: call APIs, execute code, process files
        return f"Result for: {query}"
 
    async def _arun(self, query: str, depth: int = 2) -> str:
        return self._run(query, depth)

The progressive skill loading system keeps context windows manageable — a critical design constraint for long-horizon tasks. A task requiring web research, code execution, and report generation loads exactly those three skills. A task that only needs data analysis doesn't pay the token cost of web browsing skills.
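A toy model of that accounting — skill names and token costs below are made up for illustration:

```python
# Sketch of progressive skill loading: only skills a task needs are
# added to the agent context, so each task pays only its own token cost.

SKILL_CONTEXT_TOKENS = {
    "web_search": 900,
    "code_exec": 700,
    "report_writer": 500,
    "data_analysis": 800,
}

def load_skills(required: set[str]) -> tuple[list[str], int]:
    loaded = [s for s in SKILL_CONTEXT_TOKENS if s in required]
    return loaded, sum(SKILL_CONTEXT_TOKENS[s] for s in loaded)

# A research-and-report task loads exactly three skills:
skills, cost = load_skills({"web_search", "code_exec", "report_writer"})
print(skills, cost)   # cost is 2100 tokens, not the full 2900
```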

4. Sub-Agents

When a task is too large or complex for a single agent pass, the lead agent decomposes it:

User task: "Research AI agent frameworks, benchmark them, write a report"
         ↓
  Lead Agent decomposes into:
  ├── Sub-Agent A: "Research CrewAI — stars, features, benchmarks"
  ├── Sub-Agent B: "Research LangGraph — stars, features, benchmarks"
  ├── Sub-Agent C: "Research DeerFlow — stars, features, benchmarks"
  └── Sub-Agent D: "Research AutoGen — stars, features, benchmarks"
         ↓
  Each sub-agent runs in parallel with isolated context
         ↓
  Lead Agent synthesizes into final report

Each sub-agent gets its own context window — they don't share state or interfere with each other. The lead agent manages coordination and synthesis. This is the core mechanism that enables the "hours-long tasks" DeerFlow advertises.
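The fan-out/synthesize pattern itself is simple. Here `research()` stands in for a full sub-agent pass; in DeerFlow each would run with its own context window rather than a thread:

```python
# Sketch of the lead-agent pattern: fan out isolated sub-tasks in
# parallel, then synthesize the results in a final step.
from concurrent.futures import ThreadPoolExecutor

def research(framework: str) -> str:
    # Isolated context: nothing is shared with sibling sub-agents
    return f"{framework}: stars, features, benchmarks"

def lead_agent(frameworks: list[str]) -> str:
    with ThreadPoolExecutor() as pool:
        findings = list(pool.map(research, frameworks))
    # Synthesis step: the lead agent merges sub-agent outputs
    return "\n".join(findings)

report = lead_agent(["CrewAI", "LangGraph", "DeerFlow", "AutoGen"])
print(report)
```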

5. Memory System

DeerFlow maintains two memory tiers:

| Tier | Scope | Storage | Use case |
| --- | --- | --- | --- |
| Short-term | Within a task | In-memory LangGraph state | Current task context, recent tool outputs |
| Long-term | Across sessions | Persistent vector store | User preferences, prior research, learned facts |

The long-term memory builds a user profile over time. If you ran a competitive analysis last week, DeerFlow can reference those findings in a related task today — without you re-providing context.
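A minimal sketch of the two tiers — DeerFlow's long-term tier uses embedding similarity over a vector store; naive keyword overlap stands in for that here, and the class is illustrative, not DeerFlow's API:

```python
# Two memory tiers: short-term resets per task, long-term persists
# and is recalled by relevance to the current query.

class Memory:
    def __init__(self):
        self.short_term: list[str] = []   # reset per task
        self.long_term: list[str] = []    # persists across sessions

    def remember(self, fact: str, persist: bool = False):
        (self.long_term if persist else self.short_term).append(fact)

    def recall(self, query: str) -> list[str]:
        # Keyword overlap as a stand-in for vector similarity search
        words = set(query.lower().split())
        return [f for f in self.long_term
                if words & set(f.lower().split())]

mem = Memory()
mem.remember("competitive analysis of CrewAI vs DeerFlow", persist=True)
mem.remember("tool output: 200 OK")   # short-term only, not recalled later
print(mem.recall("follow up on the competitive analysis"))
```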

6. The Messaging Gateway

DeerFlow can connect to external messaging platforms as an interface — receiving tasks and delivering results without a web UI:

# In .env — enable one or more messaging channels
TELEGRAM_BOT_TOKEN=your_token
SLACK_BOT_TOKEN=xoxb-...
FEISHU_APP_ID=your_app_id

Once configured, you can send tasks to DeerFlow from Telegram and receive structured results back — no browser required. This is how you deploy DeerFlow as a background assistant rather than an interactive web app.


Quick Start: Running DeerFlow in 5 Minutes

Prerequisites: Docker Desktop (running), Git, Python 3.11+. DeerFlow's Docker stack requires ~4GB RAM for the default configuration. For local model inference via Ollama, add 8–20GB depending on model size.

Option A: Automated setup (recommended)

# Clone the companion scripts
git clone https://github.com/aistackinsights/stackinsights.git
cd stackinsights/deerflow-2-superagent-developer-guide
chmod +x setup_deerflow.sh
./setup_deerflow.sh

The script handles everything: cloning DeerFlow, creating .env, configuring the sandbox, and starting the Docker stack.

Option B: Manual setup

# 1. Clone DeerFlow
git clone --depth=1 https://github.com/bytedance/deer-flow.git
cd deer-flow
 
# 2. Configure your LLM provider
cp .env.example .env
# Edit .env — choose your backend (see below)
 
# 3. Start the stack
docker compose up -d
 
# 4. Open the web UI
open http://localhost:3000  # macOS
# Windows: start http://localhost:3000

LLM Configuration: Which Model to Use

DeerFlow is fully model-agnostic. It works with any OpenAI-compatible API.

| Setup | Models | Config | Best for |
| --- | --- | --- | --- |
| OpenAI | gpt-4o + o3-mini | OPENAI_API_KEY=sk-... | Fastest setup, best reliability |
| Anthropic | claude-opus-4-6 | ANTHROPIC_API_KEY=sk-ant-... | Best reasoning, highest quality output |
| DeepSeek (recommended) | deepseek-v3.2 | DEEPSEEK_API_KEY=... | Best price/performance, ByteDance's own pick |
| Kimi | kimi-2.5 | MOONSHOT_API_KEY=... | Strong long-context tasks |
| Ollama (local/private) | qwen2.5:32b | OLLAMA_BASE_URL=http://localhost:11434 | Full privacy, no cloud costs |
# .env — OpenAI example (simplest)
OPENAI_API_KEY=sk-your-key-here
BASIC_MODEL=gpt-4o          # used for routine steps
REASONING_MODEL=o3-mini     # used for complex planning
 
# .env — Full local setup (no cloud)
OLLAMA_BASE_URL=http://localhost:11434
BASIC_MODEL=ollama/qwen2.5:32b
REASONING_MODEL=ollama/deepseek-r1:32b
SANDBOX_ENABLED=true        # still sandboxed even with local models
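Because every backend speaks the same OpenAI-compatible protocol, switching providers reduces to resolving a base URL and key. The helper below is illustrative, not DeerFlow code; the base URLs follow each provider's published OpenAI-compatible endpoint (Ollama exposes one at `/v1` and accepts any non-empty key):

```python
# Resolve an OpenAI-compatible endpoint per provider. Env var names
# mirror the table above; the registry itself is an assumption.
import os

PROVIDERS = {
    "openai":   ("https://api.openai.com/v1", "OPENAI_API_KEY"),
    "deepseek": ("https://api.deepseek.com",  "DEEPSEEK_API_KEY"),
    "ollama":   ("http://localhost:11434/v1", None),  # local: no key needed
}

def resolve(provider: str) -> dict:
    base_url, key_var = PROVIDERS[provider]
    return {
        "base_url": base_url,
        "api_key": os.environ.get(key_var, "") if key_var else "ollama",
    }

print(resolve("ollama"))   # fully local, no cloud key required
```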

ByteDance's own recommendation is Doubao-Seed-2.0-Code for coding tasks and DeepSeek v3.2 for research and reasoning. Doubao is primarily available in China; outside it, DeepSeek v3.2 + OpenAI o3-mini is the strongest combination the community has tested.

On the ByteDance trust question: DeerFlow itself is MIT-licensed and its code is auditable on GitHub. The security surface is Docker: the AIO sandbox is isolated, and outbound traffic from the sandbox goes only where your agent tasks direct it. The concern — legitimate for enterprise environments — is less about DeerFlow specifically and more about organizational policy on open-source software from Chinese companies. Run it air-gapped with Ollama if data sovereignty is a requirement.


Testing Your Setup

Once DeerFlow is running, test it end-to-end:

# Health check
curl http://localhost:8000/health
 
# Submit a research task (non-streaming)
curl -X POST http://localhost:8000/api/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [{
      "role": "user",
      "content": "Research the top 5 open-source LLM inference frameworks. Compare: stars, license, hardware requirements, and best use case. Output a markdown table."
    }]
  }'
 
# Or use the companion test script (streaming, with rich output)
pip install httpx rich
python test_deerflow_api.py --task "Research the top AI agent frameworks of 2026"
 
# List available skills
python test_deerflow_api.py --list-skills

Building a Custom Skill

The skills system is where DeerFlow becomes extensible. Here's a minimal production-ready skill:

# backend/src/skills/stock_analyst.py
from langchain_core.tools import BaseTool
from pydantic import BaseModel, Field
import httpx
 
class StockAnalystInput(BaseModel):
    ticker: str = Field(description="Stock ticker symbol, e.g. NVDA, AAPL")
    days: int = Field(default=30, description="Number of days of history to analyze")
 
class StockAnalystSkill(BaseTool):
    name: str = "stock_analyst"
    description: str = (
        "Fetches and analyzes stock price data for a given ticker. "
        "Use when the user asks about stock performance, price trends, or technical analysis."
    )
    args_schema: type[BaseModel] = StockAnalystInput
 
    def _run(self, ticker: str, days: int = 30) -> str:
        # Fetch data, compute indicators, return markdown summary
        # Full implementation: see custom_skill_template.py in companion repo
        return f"[Stock analysis for {ticker} over {days} days]"
 
    async def _arun(self, ticker: str, days: int = 30) -> str:
        return self._run(ticker, days)
 
def get_skill():
    return StockAnalystSkill()

Then register in backend/src/skills/__init__.py:

from .stock_analyst import get_skill as get_stock_analyst
SKILL_REGISTRY["stock_analyst"] = get_skill

Restart the backend — your skill is now available to all agents.
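The registry dispatch itself is just a name-to-factory lookup. Here is that mechanism in miniature, with plain callables standing in for the LangChain `BaseTool` machinery (the decorator and names are illustrative, not DeerFlow's internals):

```python
# Sketch of registry-based skill dispatch: skills register under a name,
# and the agent resolves them by name at run time.
from typing import Callable

SKILL_REGISTRY: dict[str, Callable] = {}

def register(name: str):
    def deco(fn):
        SKILL_REGISTRY[name] = fn
        return fn
    return deco

@register("stock_analyst")
def stock_analyst(ticker: str, days: int = 30) -> str:
    return f"[Stock analysis for {ticker} over {days} days]"

# Run-time dispatch, exactly what the agent loop does with a tool call:
result = SKILL_REGISTRY["stock_analyst"]("NVDA", days=7)
print(result)
```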


DeerFlow vs. The Alternatives

| Framework | Orchestration | Sandbox | Memory | Sub-agents | License | Stars |
| --- | --- | --- | --- | --- | --- | --- |
| DeerFlow 2.0 | LangGraph 1.0 | ✅ Docker AIO | ✅ Short + Long | ✅ Parallel | MIT | 39K |
| CrewAI | Custom | ❌ | ✅ Short | ✅ Sequential | MIT | 28K |
| AutoGen | Graph-based | ⚠️ Optional | ✅ Short | ✅ | MIT | 42K |
| LangGraph | Graph native | ❌ | ✅ Via plugins | ✅ | MIT | 12K |
| OpenDevin | LangChain | ✅ Docker | ❌ | ❌ | MIT | 33K |

The unique combination that makes DeerFlow stand out: mandatory sandbox + LangGraph 1.0 + persistent long-term memory + progressive skill loading. No other framework has all four.

The trade-off: DeerFlow is opinionated. If you want maximum flexibility in how you structure agent graphs, raw LangGraph gives you more control. If you want a batteries-included system that works out of the box for long-horizon tasks, DeerFlow wins.


Deployment: Local, Docker, Kubernetes

# Local development (no Docker — for rapid iteration)
cd deer-flow
pip install -e backend/
make run-backend   # starts FastAPI at :8000
make run-frontend  # starts Next.js at :3000
 
# Docker Compose (recommended for most use cases)
docker compose up -d
docker compose logs -f        # stream logs
docker compose down           # stop
 
# Kubernetes (for enterprise / multi-user deployments)
# DeerFlow ships a Helm chart — see docs/deployment/kubernetes.md
helm install deerflow ./charts/deerflow \
  --set llm.provider=openai \
  --set llm.apiKey=$OPENAI_API_KEY \
  --set sandbox.enabled=true

For Kubernetes deployments, the AIO Sandbox runs as a sidecar container per agent pod, ensuring isolation scales horizontally with task volume.


Who Should Use DeerFlow 2.0

Use it if:

  • You need agents that execute code, not just generate it
  • Your tasks run for minutes or hours and need robust state management
  • You want a local/private option with full data sovereignty (Ollama backend)
  • You're building research, analysis, or content automation pipelines
  • You want the flexibility of MIT license for commercial products

Look elsewhere if:

  • You need maximum graph flexibility and customization → use raw LangGraph
  • Your use case is simple multi-turn chat → overkill; use your model provider's SDK directly
  • You have hard requirements against ByteDance-origin software in your org → OpenDevin or AutoGen may be more appropriate
  • You need Windows-native tooling (DeerFlow is Linux/macOS-first)

The benchmark for open-source agent frameworks has moved. DeerFlow 2.0 is not another demo. It's a production-grade system that happens to be free, forkable, and MIT-licensed — the kind of thing that was a $20K/year enterprise software contract 18 months ago.

Go build something with it.


Sources:

  • DeerFlow 2.0 — Official GitHub Repository — bytedance/deer-flow, MIT License
  • DeerFlow Official Website & Demos — real outputs: research reports, notebooks, generated videos
  • VentureBeat: What is DeerFlow 2.0 and what should enterprises know? — Carl Franzen, March 24, 2026
  • DeepLearning.AI The Batch: DeerFlow 2.0 puts new spin on Claw-like agents — community credibility article that drove viral growth
  • LangGraph 1.0 Documentation — the orchestration backbone of DeerFlow 2.0
  • LangChain Documentation — tools, memory, and skill system foundations
  • BytePlus InfoQuest — ByteDance's search/crawl toolset integrated into DeerFlow
  • GitHub Trendshift: DeerFlow repository stats — star growth trajectory data
  • Min Choi on X: DeerFlow 2.0 announcement post — 1,300+ likes, triggered viral spread March 21
  • DeerFlow Kubernetes deployment guide — enterprise deployment documentation
  • Ollama: Run LLMs locally — local model backend for air-gapped DeerFlow deployments
  • Docker Compose documentation — DeerFlow's recommended deployment method
