Toolathlon includes a total of 108 tasks across 7 categories, and the figure below shows the detailed distribution of these tasks. You can view detailed information about all tasks here.
[Figure: distribution of Toolathlon tasks across categories]

Data Structure

Each task in Toolathlon is organized as a directory with the following layout:
task/
├── preprocess/             # Set up the initial working state, e.g. remote source, mails (optional)
├── docs/                   # Task instruction
│   ├── task.md
│   └── agent_system_prompt.md
├── initial_workspace/      # Local initial workspace (optional)
├── groundtruth_workspace/  # Local groundtruth workspace (optional)
├── evaluation/             # Test whether the task has been completed
├── ...                     # Other resources required for some tasks (optional)
└── task_config.json        # Configure the tools required for the task
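
As a rough illustration of the layout above, the sketch below checks a task directory for the listed entries. The helper name `inspect_task_dir` is hypothetical and not part of Toolathlon, and treating every entry not marked "(optional)" as required is an assumption drawn from the annotations in the tree.

```python
from pathlib import Path

# Entries copied from the layout above. Treating the ones without an
# "(optional)" annotation as required is an assumption, not a documented rule.
REQUIRED = [
    "docs/task.md",
    "docs/agent_system_prompt.md",
    "evaluation",
    "task_config.json",
]
OPTIONAL = ["preprocess", "initial_workspace", "groundtruth_workspace"]


def inspect_task_dir(task_dir):
    """Report which expected entries are missing or present (hypothetical helper)."""
    root = Path(task_dir)
    missing = [p for p in REQUIRED if not (root / p).exists()]
    optional_present = [p for p in OPTIONAL if (root / p).exists()]
    return {"missing": missing, "optional_present": optional_present}
```

Calling `inspect_task_dir("path/to/task")` returns a dictionary listing the missing required entries and the optional directories that exist, which can help spot an incomplete task before running it.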
The task_config.json file contains three attributes:
  • needed_mcp_servers: The MCP servers that the agent may use to complete this task.
  • needed_local_tools: The toolkits we implemented that the agent may use.
  • meta: The meta information of this task (optional).
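
To make the three attributes concrete, here is a minimal sketch of reading a task_config.json file. It assumes that `needed_mcp_servers` and `needed_local_tools` are lists of names and that `meta` is a JSON object; the source does not state the value types. `TaskConfig`, `parse_task_config`, and the example values are hypothetical names used only for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class TaskConfig:
    # Assumed shapes: two lists of names and an optional metadata object.
    needed_mcp_servers: list = field(default_factory=list)
    needed_local_tools: list = field(default_factory=list)
    meta: dict = field(default_factory=dict)


def parse_task_config(text):
    """Parse the JSON text of a task_config.json file (hypothetical helper)."""
    raw = json.loads(text)
    return TaskConfig(
        needed_mcp_servers=list(raw.get("needed_mcp_servers", [])),
        needed_local_tools=list(raw.get("needed_local_tools", [])),
        # meta is optional, so fall back to an empty dict when it is absent.
        meta=dict(raw.get("meta") or {}),
    )


# Placeholder values, not real Toolathlon server or tool names.
example = '{"needed_mcp_servers": ["example_server"], "needed_local_tools": ["example_tool"]}'
config = parse_task_config(example)
```

Here `config.needed_mcp_servers` holds the server names from the file, and `config.meta` is an empty dictionary because the example omits the optional attribute.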
I