
OpenBMB/AgentCPM


AgentCPM-Explore logo

中文 | English

Latest News

  • [2026-01-12] 🚀🚀🚀 We have open-sourced AgentCPM-Explore, the first open-source 4B-parameter agent model to appear on 8 widely used long-horizon agent benchmarks.

Overview

AgentCPM is a series of open-source large language model agents jointly developed by THUNLP, Renmin University of China, ModelBest, and OpenBMB.

Model List

| Model | Download Links | Resources | Technical Report | How to Use |
| --- | --- | --- | --- | --- |
| AgentCPM-Explore | 🤗 Hugging Face<br>🤖 ModelScope | AgentDock: a unified tool sandbox management and scheduling platform<br>AgentRL: an asynchronous agent reinforcement learning training framework<br>AgentToLeaP: a one-click evaluation platform for agent tool-learning capabilities | Coming Soon | README.md |

AgentCPM-Explore

The AgentCPM team has focused on systematically building agents' deep-research capabilities and has released AgentCPM-Explore, a deep-search LLM agent. AgentCPM-Explore is the first open-source 4B-parameter agent model to appear on eight widely used long-horizon agent benchmarks, including GAIA and XBench.

Key highlights:

  • SOTA at 4B Scale: Best in class among models of the same size; it matches or surpasses 8B models and rivals some 30B+ and closed-source LLMs.

  • Deep Exploration: 100+ turns of continuous interaction with multi-source cross-validation and dynamic strategy adjustment.

  • End-to-End Open Source: Complete training and evaluation infrastructure for community development and custom extensions.

Demo

Demo examples (sped up):

demo_en.mp4

QuickStart

  • Multi-model, multi-tool collaborative environment setup: First, start the AgentDock tool sandbox platform, which provides unified MCP (Model Context Protocol) tool services. For API-based models, configure the model's BASE_URL and API_KEY; for locally hosted models, make sure the model service is reachable. Configure the required tool parameters in the config.toml file.

  • Launch the environment: AgentDock works out of the box with one-click startup. A single docker compose up -d command launches all services of the unified tool sandbox platform, including the management dashboard, database, and tool nodes.

  • Run a task: The QuickStart script lets you experience the framework's core capabilities by running a complete Agent task without complex configuration.
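The model and tool settings mentioned above typically live in config.toml. The sketch below is illustrative only: the section and key names are assumptions, not the exact schema shipped with AgentDock, so check the repository's sample config for the real layout.

```toml
# Hypothetical config.toml sketch -- section and key names are assumptions;
# consult the repository's sample config for the actual schema.

[model]
base_url = "https://api.example.com/v1"  # BASE_URL of an API-based model
api_key  = "your-api-key"                # API_KEY for that endpoint
name     = "AgentCPM-Explore"

[tools]
manager_url = "http://localhost:8000"    # AgentDock MCP tool service
```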

  1. Prepare Evaluation Environment (Recommended):
    We provide a Docker image with all evaluation dependencies pre-installed. It is recommended to pull the image and run it directly:

    # 1. Enter the project folder
    cd AgentCPM-Explore
    
    # 2. Pull the image
    docker pull yuyangfu/agenttoleap-eval:v1.0
    
    # 3. Start the container (Adjust the -v path as needed)
    docker run -dit --name agenttoleap --gpus all --network host -v $(pwd):/workspace yuyangfu/agenttoleap-eval:v1.0
    
    # 4. Enter the container
    docker exec -it agenttoleap /bin/bash
    cd /workspace
  2. Configure and run:
    Open quickstart.py and set a few options in the [USER CONFIGURATION] section:

  • Custom task: Modify the QUERY variable to the instruction you want to test (e.g., “Check the results of last night’s UEFA Champions League matches”).
  • Model information: Provide your LLM API_KEY, MODEL_NAME, and BASE_URL.
  • Tool service: Set MANAGER_URL to the address of your MCP tool server (e.g., http://localhost:8000; make sure the service is already running).
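Put together, the [USER CONFIGURATION] block looks roughly like the sketch below. Only the variable names (QUERY, API_KEY, MODEL_NAME, BASE_URL, MANAGER_URL) come from the steps above; the values are placeholders you should replace with your own.

```python
# [USER CONFIGURATION] -- illustrative values only; substitute your own.
QUERY = "Check the results of last night's UEFA Champions League matches"
API_KEY = "your-api-key"                 # credential for your LLM endpoint
MODEL_NAME = "AgentCPM-Explore"          # model served at BASE_URL
BASE_URL = "https://api.example.com/v1"  # LLM API endpoint
MANAGER_URL = "http://localhost:8000"    # MCP tool server (must be running)
```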

After configuration, run:

python quickstart.py

The script will automatically create a demo task (by default, querying today’s arXiv computer science papers), generate the execution workflow, and start the evaluation process.

  3. View Results

After execution completes, results will be saved under the outputs/quickstart_results/ directory. You can inspect dialog.json to obtain the full interaction trace, including tool calls and reasoning chains.
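As a quick way to skim the saved trace, a helper like the following can summarize dialog.json. The file's exact schema is not documented here, so the code only assumes it is valid JSON, either a list of turns or a dict of metadata, and makes no claims about specific field names.

```python
import json
from pathlib import Path


def summarize_trace(path: str) -> str:
    """Return a one-line summary of a dialog.json trace file.

    Schema assumption: the file is valid JSON, either a list of turn
    objects or a dict of metadata; no specific field names are required.
    """
    trace = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(trace, list):
        return f"{len(trace)} turns recorded"
    return "top-level keys: " + ", ".join(sorted(trace))


# Usage (path taken from the QuickStart output layout described above):
# print(summarize_trace("outputs/quickstart_results/dialog.json"))
```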

Note: QuickStart mode skips automatic scoring by default; it is intended only to demonstrate the Agent's execution capabilities.

License

  • The code in this repository is released under the Apache-2.0 license.

Citation

If AgentCPM-Explore is useful for your research, please cite the codebase:

@software{AgentCPMExplore2026,
  title  = {AgentCPM-Explore: An End-to-End Infrastructure for Training and Evaluating LLM Agents},
  author = {Haotian Chen and Xin Cong and Shengda Fan and Yuyang Fu and Ziqin Gong and Yaxi Lu and Yishan Li and Boye Niu and Chengjun Pan and Zijun Song and Huadong Wang and Yesai Wu and Yueying Wu and Zihao Xie and Yukun Yan and Zhong Zhang and Yankai Lin and Zhiyuan Liu and Maosong Sun},
  year   = {2026},
  url    = {https://github.com/OpenBMB/AgentCPM}
}
