# MCP Integration

> Integrate Model Context Protocol servers as tools in Verifiers environments

The `MCPEnv` integration allows you to connect to MCP (Model Context Protocol) servers and expose their tools to language models in Verifiers environments.

MCP is an open protocol that standardizes how AI models connect to external data sources and tools.

## Features

* **Multiple MCP servers** - Connect to multiple servers simultaneously
* **Automatic tool discovery** - Tools from servers are automatically exposed to models
* **stdio transport** - Communicates with each server process over standard input/output
* **Type-safe** - Preserves tool schemas and parameter types
* **Built on ToolEnv** - Inherits all ToolEnv features
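
To illustrate what preserving tool schemas can look like, here is a minimal sketch that wraps an MCP tool descriptor (its `name`, `description`, and JSON Schema `inputSchema`) in an OpenAI-style function tool definition. The helper `mcp_tool_to_openai` and the example `fetch` schema are hypothetical; the conversion inside `MCPEnv` may differ.

```python
from typing import Any


def mcp_tool_to_openai(
    name: str, description: str, input_schema: dict[str, Any]
) -> dict[str, Any]:
    """Wrap an MCP tool descriptor in an OpenAI-style function tool definition.

    Hypothetical helper for illustration; not part of the Verifiers API.
    """
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            # The JSON Schema is passed through unchanged, so parameter
            # names, types, and required fields survive the conversion.
            "parameters": input_schema,
        },
    }


# Illustrative schema; a real server reports its own via `tools/list`.
fetch_tool = mcp_tool_to_openai(
    name="fetch",
    description="Fetch a URL and return its contents",
    input_schema={
        "type": "object",
        "properties": {"url": {"type": "string"}},
        "required": ["url"],
    },
)
```

Because the schema is forwarded as-is, the model sees the same parameter types the server declared.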

## Installation

MCP support is included in core Verifiers:

```bash theme={null}
uv add verifiers
```

The MCP SDK is automatically installed as a dependency.

## Quick Start

<Steps>
  <Step title="Create an environment">
    Create a basic MCP environment:

    ```python theme={null}
    import os
    import verifiers as vf
    from verifiers.envs.experimental.mcp_env import MCPEnv
    from datasets import Dataset

    def load_environment():
        # Configure MCP servers
        mcp_servers = [
            {
                "name": "fetch",
                "command": "uvx",
                "args": ["mcp-server-fetch"],
                "description": "Fetch web content"
            },
        ]

        # Create dataset
        dataset = Dataset.from_dict({
            "question": [
                "What is the latest news on OpenAI's website?",
            ],
            "answer": ["Recent updates about GPT models"]
        })

        # Create rubric
        rubric = vf.JudgeRubric(judge_model="gpt-4.1-mini")
        
        async def judge_reward(judge, prompt, completion, answer, state):
            # Pass `state` through to the judge, as in the full example below
            response = await judge(prompt, completion, answer, state)
            return 1.0 if "yes" in response.lower() else 0.0
        
        rubric.add_reward_func(judge_reward)

        # Create environment
        return MCPEnv(
            mcp_servers=mcp_servers,
            dataset=dataset,
            rubric=rubric,
            max_turns=10,
        )
    ```
  </Step>

  <Step title="Evaluate">
    Run an evaluation:

    ```bash theme={null}
    prime eval run my-mcp-env -m openai/gpt-4.1-mini -n 5
    ```
  </Step>
</Steps>

## MCP Server Configuration

Configure MCP servers using the `MCPServerConfig` format:

```python theme={null}
mcp_servers = [
    {
        "name": "fetch",
        "command": "uvx",
        "args": ["mcp-server-fetch"],
        "description": "Fetch web content",
    },
    {
        "name": "filesystem",
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
        "description": "File system access",
    },
]
```

**Configuration fields:**

* `name` - Identifier for the server
* `command` - Command to launch the server
* `args` - List of command arguments
* `env` - Environment variables (optional)
* `description` - Human-readable description (optional)
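
A small pre-flight check can catch malformed entries before any server process is launched. The sketch below is a hypothetical helper, not part of Verifiers. It assumes `name`, `command`, and `args` are required (only `env` and `description` are marked optional above) and that server names should be unique; the real `MCPServerConfig` may be more permissive.

```python
REQUIRED_KEYS = ("name", "command", "args")  # assumption: not marked optional above
OPTIONAL_KEYS = ("env", "description")


def check_server_configs(configs: list[dict]) -> None:
    """Raise ValueError on missing keys, unknown keys, or duplicate names."""
    seen_names = set()
    for config in configs:
        label = config.get("name", "<unnamed>")
        missing = [key for key in REQUIRED_KEYS if key not in config]
        if missing:
            raise ValueError(f"server {label!r} is missing keys: {missing}")
        unknown = sorted(set(config) - set(REQUIRED_KEYS) - set(OPTIONAL_KEYS))
        if unknown:
            raise ValueError(f"server {label!r} has unknown keys: {unknown}")
        if config["name"] in seen_names:
            raise ValueError(f"duplicate server name: {label!r}")
        seen_names.add(config["name"])


# Passes silently for a well-formed list.
check_server_configs(
    [
        {"name": "fetch", "command": "uvx", "args": ["mcp-server-fetch"]},
        {
            "name": "filesystem",
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
            "description": "File system access",
        },
    ]
)
```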

### With Environment Variables

For servers requiring API keys:

```python theme={null}
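import os
import verifiers as vf
from verifiers.envs.experimental.mcp_env import MCPEnv

def load_environment():
    # Fail fast if the key is missing from the environment
    vf.ensure_keys(["EXA_API_KEY"])

    mcp_servers = [
        {
            "name": "exa",
            "command": "npx",
            "args": ["-y", "exa-mcp-server"],
            # Forward the key to the server process
            "env": {"EXA_API_KEY": os.environ["EXA_API_KEY"]},
            "description": "Exa search",
        },
    ]

    # `dataset` and `rubric` are built as in the Quick Start example
    return MCPEnv(
        mcp_servers=mcp_servers,
        dataset=dataset,
        rubric=rubric,
    )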
```

## Available MCP Servers

Common MCP servers you can use:

### Web & Search

**Fetch** - Retrieve web content

```python theme={null}
{
    "name": "fetch",
    "command": "uvx",
    "args": ["mcp-server-fetch"],
}
```

**Exa** - AI-powered search

```python theme={null}
{
    "name": "exa",
    "command": "npx",
    "args": ["-y", "exa-mcp-server"],
    "env": {"EXA_API_KEY": os.environ["EXA_API_KEY"]},
}
```

**Brave Search** - Web search

```python theme={null}
{
    "name": "brave",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-brave-search"],
    "env": {"BRAVE_API_KEY": os.environ["BRAVE_API_KEY"]},
}
```

### File System

**Filesystem** - Read/write files

```python theme={null}
{
    "name": "filesystem",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/dir"],
}
```

### Databases

**PostgreSQL** - Query databases

```python theme={null}
{
    "name": "postgres",
    "command": "npx",
    # The reference server takes the connection URL as a positional argument
    "args": ["-y", "@modelcontextprotocol/server-postgres", os.environ["POSTGRES_URL"]],
}
```

**SQLite** - Local database access

```python theme={null}
{
    "name": "sqlite",
    # The reference SQLite server is a Python package, launched via uvx
    "command": "uvx",
    "args": ["mcp-server-sqlite", "--db-path", "database.db"],
}
```

### Development Tools

**Git** - Repository operations

```python theme={null}
{
    "name": "git",
    # The reference Git server is a Python package, launched via uvx
    "command": "uvx",
    "args": ["mcp-server-git", "--repository", "/path/to/repo"],
}
```

**GitHub** - GitHub API access

```python theme={null}
{
    "name": "github",
    "command": "npx",
    "args": ["-y", "@modelcontextprotocol/server-github"],
    # The server reads its token from GITHUB_PERSONAL_ACCESS_TOKEN
    "env": {"GITHUB_PERSONAL_ACCESS_TOKEN": os.environ["GITHUB_TOKEN"]},
}
```

See [MCP servers directory](https://github.com/modelcontextprotocol/servers) for more servers.

## Full Example

Here's a complete example using multiple MCP servers:

```python theme={null}
import os
from datasets import Dataset
import verifiers as vf
from verifiers.envs.experimental.mcp_env import MCPEnv

def load_environment(
    mcp_servers: list | None = None,
    dataset=None,
    **kwargs
) -> vf.Environment:
    # Validate API keys
    vf.ensure_keys(["EXA_API_KEY"])
    
    # Configure MCP servers
    if mcp_servers is None:
        mcp_servers = [
            {
                "name": "exa",
                "command": "npx",
                "args": ["-y", "exa-mcp-server"],
                "env": {"EXA_API_KEY": os.environ["EXA_API_KEY"]},
                "description": "Exa AI search",
            },
            {
                "name": "fetch",
                "command": "uvx",
                "args": ["mcp-server-fetch"],
                "description": "Fetch web content",
            },
        ]
    
    # Create dataset
    if dataset is None:
        dataset = Dataset.from_dict({
            "question": [
                "Find the latest Prime Intellect announcement",
                "What is the current weather in San Francisco?",
            ],
            "answer": [
                "Information about recent announcements",
                "Current weather conditions",
            ]
        })
    
    # Create rubric with judge
    rubric = vf.JudgeRubric(judge_model="gpt-4.1-mini")
    
    async def judge_reward(judge, prompt, completion, answer, state):
        verdict = await judge(prompt, completion, answer, state)
        return 1.0 if "yes" in verdict.lower() else 0.0
    
    rubric.add_reward_func(judge_reward, weight=1.0)
    
    # Create MCP environment
    return MCPEnv(
        mcp_servers=mcp_servers,
        dataset=dataset,
        rubric=rubric,
        max_turns=10,
        **kwargs,
    )
```

## Error Handling

Pass an `error_formatter` to control how tool errors are reported to the model:

```python theme={null}
def custom_error_formatter(error: Exception) -> str:
    """Format errors for the model."""
    return f"Tool error: {str(error)[:100]}"

env = MCPEnv(
    mcp_servers=mcp_servers,
    dataset=dataset,
    rubric=rubric,
    error_formatter=custom_error_formatter,
)
```
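
A formatter can also include the exception type and collapse multi-line messages so the model gets a compact, readable signal. The following is one possible variant; `concise_error_formatter` and its `max_chars` default are illustrative choices, not library defaults.

```python
def concise_error_formatter(error: Exception, max_chars: int = 200) -> str:
    """Return a one-line, length-capped description of a tool error."""
    # Collapse newlines and repeated whitespace into single spaces.
    message = " ".join(str(error).split())
    if len(message) > max_chars:
        # Reserve three characters for the truncation marker.
        message = message[: max_chars - 3] + "..."
    return f"Tool error ({type(error).__name__}): {message}"
```

Since `max_chars` has a default, the function can still be passed as a single-argument `error_formatter`.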

## Architecture Notes

<Note>
  `MCPEnv` is designed for **globally available, read-only MCP servers** where the same toolset can be shared across all rollouts. For servers requiring per-rollout state or mutable task-specific data, consider implementing a custom `StatefulToolEnv` subclass.
</Note>

### Connection Management

MCP servers are connected once during environment initialization and shared across all rollouts:

1. Environment starts background event loop
2. Connects to all configured MCP servers
3. Discovers available tools via `tools/list`
4. Exposes tools to rollouts
5. Cleanup on environment shutdown
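
The lifecycle above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the `MCPEnv` implementation: `BackgroundLoopSketch` is a hypothetical class, and `_connect` is a stand-in that fakes the server connection and `tools/list` call.

```python
import asyncio
import threading


class BackgroundLoopSketch:
    """Runs one asyncio event loop on a daemon thread, shared by all callers."""

    def __init__(self) -> None:
        # Step 1: start the background event loop.
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self.thread.start()
        self.tools: dict[str, str] = {}  # tool name -> owning server name

    def run(self, coro, timeout: float = 10.0):
        """Submit a coroutine to the background loop and wait for its result."""
        return asyncio.run_coroutine_threadsafe(coro, self.loop).result(timeout)

    async def _connect(self, server_name: str) -> list[str]:
        # Stand-in for launching the server process and calling `tools/list`.
        await asyncio.sleep(0)
        return [f"{server_name}_tool"]

    def connect_all(self, server_names: list[str]) -> None:
        # Steps 2-4: connect to each server once and record its tools.
        for name in server_names:
            for tool in self.run(self._connect(name)):
                self.tools[tool] = name

    def shutdown(self) -> None:
        # Step 5: stop the loop, wait for the thread, release resources.
        self.loop.call_soon_threadsafe(self.loop.stop)
        self.thread.join(timeout=5)
        self.loop.close()


env = BackgroundLoopSketch()
env.connect_all(["fetch", "exa"])
discovered = sorted(env.tools)
env.shutdown()
```

Keeping a single loop alive for the lifetime of the environment is what lets every rollout reuse the same connections instead of relaunching servers.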

### Tool Execution

When a model calls an MCP tool:

1. Tool call is intercepted by `MCPEnv`
2. Request is sent to appropriate MCP server
3. Response is returned as tool message
4. If the call fails, the error is formatted via `error_formatter` and returned to the model in place of the result
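
The flow above amounts to a lookup, an awaited call, and a fallback to the error formatter. The sketch below shows that shape with a fake tool; `dispatch_tool_call`, the `fetch` stub, and the message dict layout are assumptions for illustration rather than the actual `MCPEnv` internals.

```python
import asyncio
from typing import Awaitable, Callable

ToolFn = Callable[..., Awaitable[str]]


def default_formatter(error: Exception) -> str:
    return f"Tool error: {error}"


async def dispatch_tool_call(
    tools: dict[str, ToolFn],
    name: str,
    arguments: dict,
    tool_call_id: str,
    error_formatter: Callable[[Exception], str] = default_formatter,
) -> dict:
    """Route a tool call to its handler and wrap the outcome as a tool message."""
    try:
        content = await tools[name](**arguments)
    except Exception as exc:
        # Failures (including unknown tool names) become text for the model.
        content = error_formatter(exc)
    return {"role": "tool", "tool_call_id": tool_call_id, "content": content}


async def fetch(url: str) -> str:
    # Fake tool standing in for a request to an MCP server.
    if not url.startswith("http"):
        raise ValueError("unsupported URL scheme")
    return f"<contents of {url}>"


registry = {"fetch": fetch}
ok = asyncio.run(
    dispatch_tool_call(registry, "fetch", {"url": "https://example.com"}, "call_1")
)
failed = asyncio.run(
    dispatch_tool_call(registry, "fetch", {"url": "ftp://example.com"}, "call_2")
)
```

Returning the formatted error as an ordinary tool message lets the model see what went wrong and retry with different arguments.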

## Best Practices

* **Validate API keys** - Use `vf.ensure_keys()` to fail fast if keys are missing
* **Document requirements** - List required environment variables in README
* **Test servers locally** - Verify MCP servers work before using in environments
* **Handle errors gracefully** - Provide clear error messages via `error_formatter`
* **Limit tool calls** - Set reasonable `max_turns` to prevent infinite loops
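
For local testing, one option is to connect to a server directly with the MCP Python SDK and list its tools before wiring it into an environment. The sketch below uses the SDK's `StdioServerParameters`, `stdio_client`, and `ClientSession`; the helper names `describe_tools` and `list_server_tools` are hypothetical, and the config dict follows the format shown earlier.

```python
def describe_tools(tools) -> list[str]:
    """Render one 'name: description' line per discovered tool."""
    return [
        f"{tool.name}: {tool.description or '(no description)'}" for tool in tools
    ]


async def list_server_tools(config: dict) -> list[str]:
    """Launch one MCP server over stdio and return a summary of its tools."""
    # Imported here so `describe_tools` works without the MCP SDK installed.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(
        command=config["command"],
        args=config.get("args", []),
        env=config.get("env"),
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.list_tools()
            return describe_tools(result.tools)
```

Running `asyncio.run(list_server_tools({"name": "fetch", "command": "uvx", "args": ["mcp-server-fetch"]}))` should print nothing on its own but return the tool summaries, provided `uvx` and the server package are available on the machine.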

## Limitations

* MCP servers must support stdio transport
* Servers are started once per environment, not per rollout
* No support for resources or prompts (tools only)
* Best suited to read-only operations: servers are shared across rollouts, so there is no per-rollout state

## Examples

See the [mcp-search-env](https://github.com/PrimeIntellect-ai/verifiers/tree/main/environments/mcp_search_env) example in the Verifiers repository for a complete implementation.

## Further Reading

* [MCP Specification](https://modelcontextprotocol.io/specification)
* [MCP Servers Repository](https://github.com/modelcontextprotocol/servers)
* [MCP Python SDK](https://github.com/modelcontextprotocol/python-sdk)
