# ReasoningGymEnv

> Wrapper for Reasoning Gym procedural reasoning datasets

Wrapper environment for [Reasoning Gym](https://github.com/reasoning-gym/reasoning-gym) procedural reasoning tasks.

## Overview

`ReasoningGymEnv` wraps Reasoning Gym datasets for use in Verifiers. It supports both single datasets and composite mixtures, automatically handles scoring via Reasoning Gym's built-in evaluators, and provides procedurally generated tasks.

**Key features:**

* Procedural task generation via seeds
* Support for all Reasoning Gym datasets
* Composite dataset mixing with custom weights
* Automatic task-specific scoring
* Built-in XML parser for structured responses

## Installation

Install with Reasoning Gym support:

```bash theme={null}
uv add 'verifiers[rg]'
```

Or when developing in the verifiers repo:

```bash theme={null}
uv sync --extra rg
```

See the [Reasoning Gym integration guide](/integrations/reasoning-gym) for setup details.

## Inheritance

```
Environment
└── MultiTurnEnv
    └── SingleTurnEnv
        └── ReasoningGymEnv
```

## Constructor

```python theme={null}
ReasoningGymEnv(
    gym: str | List[str | dict],
    num_train_examples: int = 1000,
    num_eval_examples: int = 100,
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
    parser: vf.Parser | None = None,
    seed: int = 0,
)
```

### Parameters

<ParamField path="gym" type="str | List[str | dict]">
  Dataset specification. Can be:

  * String: Single dataset name (e.g., `"arc_1d"`)
  * List of strings: Multiple datasets with equal weights
  * List of dicts: Datasets with custom weights and configs using `DatasetSpec` format
</ParamField>

<ParamField path="num_train_examples" type="int" default="1000">
  Number of training examples to generate.
</ParamField>

<ParamField path="num_eval_examples" type="int" default="100">
  Number of evaluation examples to generate.
</ParamField>

<ParamField path="system_prompt" type="str" default="DEFAULT_SYSTEM_PROMPT">
  System prompt for the model. Defaults to Reasoning Gym's default prompt.
</ParamField>

<ParamField path="parser" type="vf.Parser | None" default="None">
  Parser for model responses. If `None`, uses `XMLParser(fields=["answer"])`.
</ParamField>

<ParamField path="seed" type="int" default="0">
  Random seed for procedural generation.
</ParamField>

## Key Methods

### build\_rg\_dataset

```python theme={null}
def build_rg_dataset(
    gym: str | List[str | dict],
    total_examples: int = 1000,
    seed: int = 0
) -> ProceduralDataset
```

Construct a Reasoning Gym dataset from the specification.

**Handles three formats:**

1. String: Single dataset → `rg.create_dataset(gym, size=total_examples, seed=seed)`
2. List of strings: Multiple datasets with equal weights (1.0 each)
3. List of dicts: Datasets with custom `DatasetSpec` configurations
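
To make the three formats concrete, here is a minimal sketch that normalizes any accepted `gym` value into a list of spec dicts. `normalize_gym_spec` is a hypothetical helper written for illustration, not part of Verifiers or Reasoning Gym, and the real `build_rg_dataset` may treat the single-string case differently (it calls `rg.create_dataset` directly, as noted above).

```python
from __future__ import annotations

from typing import Any


def normalize_gym_spec(gym: str | list[str | dict[str, Any]]) -> list[dict[str, Any]]:
    """Hypothetical helper: turn any accepted `gym` value into a list of spec dicts."""
    if isinstance(gym, str):
        # Format 1: a single dataset name.
        return [{"name": gym, "weight": 1.0, "config": {}}]

    specs: list[dict[str, Any]] = []
    for item in gym:
        if isinstance(item, str):
            # Format 2: bare names all receive the same weight of 1.0.
            specs.append({"name": item, "weight": 1.0, "config": {}})
        elif isinstance(item, dict):
            # Format 3: dicts may omit weight/config; fill in the documented defaults.
            specs.append(
                {
                    "name": item["name"],
                    "weight": float(item.get("weight", 1.0)),
                    "config": dict(item.get("config", {})),
                }
            )
        else:
            raise TypeError(f"unsupported gym entry: {item!r}")
    return specs


if __name__ == "__main__":
    print(normalize_gym_spec(["arc_1d", {"name": "arc_1d", "weight": 2.0}]))
```

Normalizing up front keeps the rest of the construction logic working with a single shape, whichever format the caller used.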

### rg\_to\_hf

```python theme={null}
def rg_to_hf(
    rg_dataset: ProceduralDataset
) -> Tuple[Dataset, Dataset]
```

Convert a Reasoning Gym dataset into Hugging Face train and eval datasets.

**Dataset format:**

* `question`: Task prompt from Reasoning Gym
* `answer`: Index as string (used to retrieve entry for scoring)
* `task`: Source dataset name from metadata
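
The row format above can be sketched with plain Python lists standing in for the Hugging Face datasets. This is an illustration under stated assumptions, not the library's code: `split_rows` is a hypothetical name, the metadata key `source_dataset` is assumed, and so is the rule that the first `num_train_examples` entries form the train split.

```python
from typing import Any


def split_rows(
    entries: list[dict[str, Any]], num_train_examples: int
) -> tuple[list[dict[str, str]], list[dict[str, str]]]:
    """Hypothetical sketch: build question/answer/task rows and split them."""
    train_rows: list[dict[str, str]] = []
    eval_rows: list[dict[str, str]] = []
    for index, entry in enumerate(entries):
        row = {
            "question": entry["question"],
            # The index is stored instead of the true answer, so the scorer
            # can look the full entry up again later.
            "answer": str(index),
            # Assumed metadata key holding the source dataset name.
            "task": entry["metadata"]["source_dataset"],
        }
        # Assumption: the leading entries are train, the remainder eval.
        if index < num_train_examples:
            train_rows.append(row)
        else:
            eval_rows.append(row)
    return train_rows, eval_rows


if __name__ == "__main__":
    toy_entries = [
        {"question": f"q{i}", "answer": str(i * 2), "metadata": {"source_dataset": "toy"}}
        for i in range(5)
    ]
    train, evaluation = split_rows(toy_entries, num_train_examples=3)
    print(len(train), len(evaluation))
```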

## Scoring

`ReasoningGymEnv` automatically creates a rubric with a custom reward function:

```python theme={null}
async def check_answer_reward_func(
    completion: vf.Messages,
    answer: str,
    **kwargs
) -> float:
    # Look up the original entry; `answer` holds its index as a string
    entry = self.rg_dataset[int(answer)]
    # Parse model response
    response = str(parser.parse_answer(completion)).strip()
    # Score using Reasoning Gym's scoring
    reward = self.rg_dataset.score_answer(answer=response, entry=entry)
    return reward
```

The reward function uses Reasoning Gym's task-specific scoring, which varies by dataset (exact match, fuzzy match, numeric tolerance, etc.).
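
The index-lookup flow can be tried without installing Reasoning Gym by using small stand-ins. In the sketch below, `ToyDataset`, `extract_answer`, and `score` are hypothetical names. The toy scorer uses exact match only, and the regex is a simplified substitute for the parser's `parse_answer`.

```python
import re
from typing import Any, Optional


class ToyDataset:
    """Stand-in for a Reasoning Gym dataset with exact-match scoring only."""

    def __init__(self, entries: list[dict[str, Any]]) -> None:
        self._entries = entries

    def __getitem__(self, index: int) -> dict[str, Any]:
        return self._entries[index]

    def score_answer(self, answer: str, entry: dict[str, Any]) -> float:
        return 1.0 if answer == entry["answer"] else 0.0


def extract_answer(text: str) -> Optional[str]:
    """Simplified stand-in for a parser: content of the last <answer> tag, if any."""
    matches = re.findall(r"<answer>(.*?)</answer>", text, flags=re.DOTALL)
    return matches[-1] if matches else None


def score(dataset: ToyDataset, completion_text: str, answer_index: str) -> float:
    # Recover the full entry from the index stored in the `answer` column.
    entry = dataset[int(answer_index)]
    # A missing tag becomes the string "None", which simply fails to match.
    response = str(extract_answer(completion_text)).strip()
    return dataset.score_answer(answer=response, entry=entry)


if __name__ == "__main__":
    dataset = ToyDataset([{"question": "What is 1 + 1?", "answer": "2"}])
    print(score(dataset, "Thinking... <answer> 2 </answer>", "0"))
```

Storing the index rather than the ground-truth answer gives the scorer access to the whole entry, including any metadata a task-specific evaluator needs.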

## Example Usage

### Single Dataset

```python theme={null}
import verifiers as vf
from verifiers.envs.integrations.reasoninggym_env import ReasoningGymEnv

def load_environment():
    return ReasoningGymEnv(
        gym="arc_1d",
        num_train_examples=1000,
        num_eval_examples=100,
        seed=0,
    )
```

### Multiple Datasets (Equal Weights)

```python theme={null}
import verifiers as vf
from verifiers.envs.integrations.reasoninggym_env import ReasoningGymEnv

def load_environment():
    return ReasoningGymEnv(
        gym=["arc_1d", "gsm8k", "math_count"],
        num_train_examples=3000,  # about 1000 per dataset on average
        num_eval_examples=300,     # about 100 per dataset on average
        seed=42,
    )
```

### Composite Dataset (Custom Weights)

```python theme={null}
import verifiers as vf
from verifiers.envs.integrations.reasoninggym_env import ReasoningGymEnv

def load_environment():
    # Use dict format for DatasetSpec
    return ReasoningGymEnv(
        gym=[
            {"name": "arc_1d", "weight": 2.0, "config": {}},
            {"name": "gsm8k", "weight": 1.0, "config": {}},
            {"name": "math_count", "weight": 0.5, "config": {}},
        ],
        num_train_examples=1000,
        num_eval_examples=100,
        seed=0,
    )
```

### Custom Parser

```python theme={null}
import verifiers as vf
from verifiers.envs.integrations.reasoninggym_env import ReasoningGymEnv

def load_environment():
    # Use custom parser for chain-of-thought
    parser = vf.XMLParser(
        fields=["reasoning", "answer"],
        answer_field="answer"
    )
    
    return ReasoningGymEnv(
        gym="gsm8k",
        parser=parser,
        system_prompt="Solve the math problem step by step. Use <reasoning> for your work and <answer> for the final numerical answer.",
        num_train_examples=1000,
    )
```

### With Custom System Prompt

```python theme={null}
import verifiers as vf
from verifiers.envs.integrations.reasoninggym_env import ReasoningGymEnv
from reasoning_gym.utils import SYSTEM_PROMPTS

def load_environment():
    # Use Reasoning Gym's CoT prompt (list SYSTEM_PROMPTS.keys() to see
    # which prompts your installed version provides)
    return ReasoningGymEnv(
        gym="arc_1d",
        system_prompt=SYSTEM_PROMPTS["cot"],
        num_train_examples=1000,
    )
```

## Available Datasets

Reasoning Gym provides many procedural datasets. Some popular ones:

* **Pattern Recognition:** `arc_1d`, `arc_2d`
* **Math:** `gsm8k`, `math_count`, `number_theory`
* **Logic:** `boolean_logic`, `propositional_logic`
* **Sequences:** `sequence_next`, `sequence_missing`
* **Spatial:** `grid_navigation`, `spatial_reasoning`

Dataset names must match Reasoning Gym's registry exactly. Check the [Reasoning Gym repository](https://github.com/reasoning-gym/reasoning-gym) for the complete, current list.

## DatasetSpec Format

When using composite datasets with custom weights, use this format:

```python theme={null}
{
    "name": str,        # Dataset name (e.g., "arc_1d")
    "weight": float,    # Sampling weight (default 1.0)
    "config": dict      # Dataset-specific config (default {})
}
```
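
To see what a set of weights implies, the sketch below converts them into expected sampling shares. It assumes weights are relative, so each dataset's share is its weight divided by the sum of all weights. `expected_shares` is a hypothetical helper, not part of either library.

```python
from typing import Any


def expected_shares(specs: list[dict[str, Any]]) -> dict[str, float]:
    """Hypothetical helper: expected fraction of examples drawn from each dataset."""
    # Missing weights fall back to the documented default of 1.0.
    total = sum(spec.get("weight", 1.0) for spec in specs)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return {spec["name"]: spec.get("weight", 1.0) / total for spec in specs}


if __name__ == "__main__":
    specs = [
        {"name": "arc_1d", "weight": 2.0, "config": {}},
        {"name": "gsm8k", "weight": 1.0, "config": {}},
        {"name": "math_count", "weight": 0.5, "config": {}},
    ]
    for name, share in expected_shares(specs).items():
        print(f"{name}: {share:.1%}")
```

Under that assumption, weights of 2.0, 1.0, and 0.5 correspond to roughly 57%, 29%, and 14% of the examples. If sampling is random, the counts in any one generated dataset will vary around those shares.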

## Procedural Generation

All tasks are procedurally generated using seeds:

* Each example gets a unique seed: `seed + index`
* Same seed always generates the same task
* Changing `seed` produces a different set of tasks, so variations are effectively unlimited
* Results are reproducible across runs for a fixed `seed`
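
The `seed + index` scheme can be illustrated with the standard library alone. The toy generator below is not Reasoning Gym code, and `toy_task` is a hypothetical name. It only shows how seeding a fresh generator per example makes each task a pure function of `seed + index`.

```python
import random


def toy_task(seed: int, index: int) -> dict[str, str]:
    """Hypothetical generator: one addition task derived from seed + index."""
    # A dedicated RNG per example means no shared state between examples.
    rng = random.Random(seed + index)
    a, b = rng.randint(0, 99), rng.randint(0, 99)
    return {"question": f"What is {a} + {b}?", "answer": str(a + b)}


if __name__ == "__main__":
    # The same (seed, index) pair always yields the same task.
    print(toy_task(0, 3) == toy_task(0, 3))
    # Only the sum matters: seed=3, index=0 yields the same task as seed=0, index=3.
    print(toy_task(3, 0) == toy_task(0, 3))
```

One consequence of this scheme is that two runs whose seeds differ by less than the dataset size share some examples, so widely spaced seeds are the safer choice when train and eval sets must not overlap.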

## See Also

* [Reasoning Gym Integration Guide](/integrations/reasoning-gym) - Setup and dataset details
* [SingleTurnEnv](/api/single-turn-env) - Base class documentation
* [XMLParser](/api/xml-parser) - Parser for structured responses
* [Rubric](/api/rubric) - Reward function configuration
