Metadata-Version: 2.4
Name: rnow
Version: 0.4.12
Summary: ReinforceNow CLI - Reinforcement Learning platform command-line interface
Requires-Python: <3.15,>=3.11
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: click>=8.0.0
Requires-Dist: requests>=2.25.0
Requires-Dist: httpx>=0.24.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: pyyaml>=5.4.0
Requires-Dist: packaging>=21.0
Requires-Dist: prompt_toolkit>=3.0.0
Requires-Dist: rich>=13.0.0
Requires-Dist: python-dotenv>=1.0.0
Requires-Dist: tokenizers>=0.15.0
Requires-Dist: openai-harmony>=0.0.8
Requires-Dist: boto3==1.34.100
Requires-Dist: datasets
Provides-Extra: test
Requires-Dist: tinker-cookbook>=0.1.0; extra == "test"
Requires-Dist: transformers>=4.40.0; extra == "test"
Provides-Extra: api
Requires-Dist: fastapi>=0.68.0; extra == "api"
Requires-Dist: uvicorn>=0.15.0; extra == "api"
Provides-Extra: mcp
Requires-Dist: fastmcp>=0.1.0; extra == "mcp"
Provides-Extra: all
Requires-Dist: tinker-cookbook>=0.1.0; extra == "all"
Requires-Dist: transformers>=4.40.0; extra == "all"
Requires-Dist: fastapi>=0.68.0; extra == "all"
Requires-Dist: uvicorn>=0.15.0; extra == "all"
Requires-Dist: fastmcp>=0.1.0; extra == "all"
Dynamic: license-file

<div align="center">
  <img
    alt="ReinforceNow CLI"
    src="./assets/header.png"
    width="100%"
  >
  <br><br>

[![PyPI version](https://img.shields.io/pypi/v/rnow?color=blue)](https://pypi.org/project/rnow/)
[![Docs](https://img.shields.io/badge/docs-reinforcenow.ai-blue)](https://reinforcenow.ai/docs)
[![Follow on X](https://img.shields.io/badge/Follow_on_X-@reinforcenow-black?labelColor=white)](https://x.com/reinforcenow)
[![MIT License](https://img.shields.io/badge/license-MIT-green)](./LICENSE)

</div>

# Documentation

See the [documentation](https://www.reinforcenow.ai/docs/getting-started/quickstart) for a technical overview of the platform, then [train your first agent](https://www.reinforcenow.ai/docs/getting-started/first-agent).

# Quick Start

### 1. Install uv (Python package manager)

```bash
# macOS/Linux:
curl -LsSf https://astral.sh/uv/install.sh | sh

# Windows (PowerShell):
powershell -c "irm https://astral.sh/uv/install.ps1 | iex"
```

### 2. Install ReinforceNow

```bash
uv init && uv venv --python 3.11
source .venv/bin/activate  # Windows: .\.venv\Scripts\Activate.ps1
uv pip install rnow
```

### 3. Authenticate

```bash
rnow login
```

### 4. Create & Run Your First Project

```bash
rnow init --template sft
rnow run
```

That's it! Your training run will start on ReinforceNow's infrastructure. Monitor progress in the [dashboard](https://reinforcenow.ai/home).

![ReinforceNow Graph](./assets/reinforcenow-graph.png)

# Core Concepts

Go from raw data to a reliable AI agent in production. ReinforceNow lets you define three things: reward functions, tools, and training data.

### 1. Reward Functions

Use the `@reward` decorator to define how your model's output is scored:

```python
from rnow.core import reward, RewardArgs

@reward
async def accuracy(args: RewardArgs, messages: list) -> float:
    """Check if the model's answer matches ground truth."""
    response = messages[-1]["content"]
    expected = args.metadata["answer"]
    return 1.0 if expected in response else 0.0
```

→ [Write your first reward function](https://www.reinforcenow.ai/docs/getting-started/first-reward)
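
To see how the substring check above behaves before launching a run, it can help to exercise the same logic locally. The sketch below is not part of the rnow API: `FakeRewardArgs` and `score` are hypothetical stand-ins introduced here, the `@reward` decorator is omitted so the snippet runs without rnow installed, and only the `metadata` attribute used in the example is modeled.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeRewardArgs:
    """Hypothetical stand-in for RewardArgs; only `metadata` is modeled."""

    metadata: dict = field(default_factory=dict)


async def accuracy(args: FakeRewardArgs, messages: list) -> float:
    """Same substring check as the README example, without @reward."""
    response = messages[-1]["content"]
    expected = args.metadata["answer"]
    return 1.0 if expected in response else 0.0


def score(answer: str, reply: str) -> float:
    """Run the async reward once against a single assistant reply."""
    args = FakeRewardArgs(metadata={"answer": answer})
    messages = [{"role": "assistant", "content": reply}]
    return asyncio.run(accuracy(args, messages))


if __name__ == "__main__":
    # Exact substring present: full reward.
    print(score("2H2 + O2 → 2H2O", "The balanced equation is 2H2 + O2 → 2H2O."))
    # Chemically the same answer, but spacing and arrow differ: no reward.
    print(score("2H2 + O2 → 2H2O", "2 H2 + O2 -> 2 H2O"))
```

The second call shows that a plain substring match is strict about whitespace and symbols, so you may want to normalize both strings before comparing them.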

### 2. Tools (for Agents)

Give your model the ability to call functions during training:

```python
from rnow.core import tool

@tool
def search(query: str, max_results: int = 5) -> dict:
    """Search the web for information."""
    # Your implementation here
    return {"results": [...]}
```

→ [Train an agent with custom tools](https://www.reinforcenow.ai/docs/getting-started/first-agent)

### 3. Training Data

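Create a `train.jsonl` file with your prompts and reward assignments. Each line is one JSON object holding the prompt `messages`, the names of the `rewards` to apply, and any `metadata` your reward functions read (such as the `answer` used by `accuracy` above):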

```json
{"messages": [{"role": "user", "content": "Balance the equation: Fe + O2 → Fe2O3"}], "rewards": ["accuracy"], "metadata": {"answer": "4Fe + 3O2 → 2Fe2O3"}}
{"messages": [{"role": "user", "content": "Balance the equation: H2 + O2 → H2O"}], "rewards": ["accuracy"], "metadata": {"answer": "2H2 + O2 → 2H2O"}}
{"messages": [{"role": "user", "content": "Balance the equation: N2 + H2 → NH3"}], "rewards": ["accuracy"], "metadata": {"answer": "N2 + 3H2 → 2NH3"}}
```

→ [Learn about training data format](https://www.reinforcenow.ai/docs/cli-reference/train-data)
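
A malformed row is easier to catch locally than after a run has started. The following is a minimal sketch, using only the Python standard library, that writes rows to a JSONL file and checks each line. The helper names (`validate_row`, `write_jsonl`, `check_jsonl`) are hypothetical, and the rules it enforces are assumptions inferred from the sample rows above, not the platform's official schema; in particular it treats `messages` and `rewards` as required and `metadata` as optional.

```python
import json
import tempfile
from pathlib import Path

# Assumption: inferred from the sample rows, not an official schema.
REQUIRED_KEYS = ("messages", "rewards")


def validate_row(row) -> list[str]:
    """Return a list of problems found in one parsed row (empty if none)."""
    if not isinstance(row, dict):
        return ["row is not a JSON object"]
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in row]
    for msg in row.get("messages", []):
        if not isinstance(msg, dict) or "role" not in msg or "content" not in msg:
            problems.append("message without role/content")
    if not all(isinstance(name, str) for name in row.get("rewards", [])):
        problems.append("reward names must be strings")
    return problems


def write_jsonl(rows, path) -> None:
    """Write one JSON object per line, keeping characters like → readable."""
    with Path(path).open("w", encoding="utf-8") as fh:
        for row in rows:
            fh.write(json.dumps(row, ensure_ascii=False) + "\n")


def check_jsonl(path) -> dict[int, list[str]]:
    """Map 1-based line numbers to the problems found on that line."""
    errors: dict[int, list[str]] = {}
    with Path(path).open(encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # ignore blank lines
            try:
                row = json.loads(line)
            except json.JSONDecodeError as exc:
                errors[lineno] = [f"invalid JSON: {exc.msg}"]
                continue
            problems = validate_row(row)
            if problems:
                errors[lineno] = problems
    return errors


if __name__ == "__main__":
    rows = [
        {
            "messages": [{"role": "user", "content": "Balance the equation: H2 + O2 → H2O"}],
            "rewards": ["accuracy"],
            "metadata": {"answer": "2H2 + O2 → 2H2O"},
        }
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "train.jsonl"
        write_jsonl(rows, path)
        print(check_jsonl(path) or "no problems found")
```

To check an existing file, call `check_jsonl("train.jsonl")` from the project directory; an empty dictionary means no line tripped the assumed rules.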

# Contributing

We welcome contributions! ❤️ Please open an issue to discuss your ideas before submitting a PR.

<br>
<div align="center">
  <img
    alt="ReinforceNow"
    src="./assets/footer.png"
    width="100%"
  >
</div>
