Metadata-Version: 2.4
Name: synkro
Version: 0.5.49
Summary: Curate, evaluate, and ship LLM datasets from any document
Author: Murtaza Meerza
License-Expression: Apache-2.0
License-File: LICENSE
Keywords: dataset-generation,fine-tuning,llm,synthetic-data,training-data
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.10
Requires-Dist: aiosqlite>=0.19
Requires-Dist: beautifulsoup4>=4.12
Requires-Dist: html2text>=2020.1
Requires-Dist: httpx>=0.25
Requires-Dist: litellm>=1.40
Requires-Dist: mammoth>=1.6
Requires-Dist: pydantic>=2.0
Requires-Dist: pymupdf>=1.24
Requires-Dist: readchar>=4.0
Requires-Dist: rich>=13.0
Requires-Dist: sqlalchemy[asyncio]>=2.0
Requires-Dist: typer>=0.9
Provides-Extra: dev
Requires-Dist: pre-commit>=3.7; extra == 'dev'
Requires-Dist: pytest-asyncio>=0.21; extra == 'dev'
Requires-Dist: pytest>=7.0; extra == 'dev'
Requires-Dist: ruff>=0.1; extra == 'dev'
Provides-Extra: postgres
Requires-Dist: asyncpg>=0.29; extra == 'postgres'
Provides-Extra: verify
Requires-Dist: asyncpg>=0.29; extra == 'verify'
Requires-Dist: openai>=1.0; extra == 'verify'
Requires-Dist: pyyaml>=6.0; extra == 'verify'
Description-Content-Type: text/markdown

# Synkro

Curate, evaluate, and ship LLM datasets from any document.

[![PyPI version](https://img.shields.io/pypi/v/synkro.svg?cacheSeconds=3600)](https://pypi.org/project/synkro/)
[![Downloads](https://static.pepy.tech/badge/synkro)](https://pepy.tech/project/synkro)
[![Python 3.10+](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/downloads/)
[![Documentation](https://img.shields.io/badge/docs-synkro.sh-purple.svg)](https://synkro.sh/docs)
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)

## Installation

```bash
pip install synkro
```

## Quick Start

```python
from synkro import create_pipeline, DatasetType
from synkro.models import Google
from synkro.examples import EXPENSE_POLICY

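# Configure the pipeline: one model generates traces, a second model grades them
pipeline = create_pipeline(
    model=Google.GEMINI_25_FLASH,
    grading_model=Google.GEMINI_25_PRO,
    dataset_type=DatasetType.CONVERSATION,
)

# Generate 50 traces from the example expense policy and save them as JSONL
dataset = pipeline.generate(EXPENSE_POLICY, traces=50)
dataset.save("training.jsonl")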
```
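
To sanity-check the saved file before training, a small standard-library helper can be enough. The sketch below assumes only that `training.jsonl` follows the usual JSON Lines convention (one JSON value per line). It makes no assumption about the fields Synkro writes, and `load_jsonl` and `top_level_keys` are hypothetical helper names, not part of the Synkro API.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file into a list of Python objects, skipping blank lines."""
    records = []
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def top_level_keys(records):
    """Collect the sorted set of top-level keys across all dict records."""
    keys = set()
    for record in records:
        if isinstance(record, dict):
            keys.update(record)
    return sorted(keys)
```

Calling `load_jsonl("training.jsonl")` and then `top_level_keys(...)` on the result gives a quick view of how many records were written and which fields they carry.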

Or use the CLI:

```bash
synkro generate policy.pdf --traces 50

# Quick demo using the built-in policy
synkro demo
```

## Features

- **Multiple dataset types** - Conversation, Instruction, Evaluation, Tool Calling
- **Auto grading & refinement** - Responses are graded and refined until they pass
- **Coverage tracking** - Track scenario diversity and identify gaps
- **Eval platform export** - LangSmith, Langfuse, Q&A formats
- **Any LLM** - OpenAI, Anthropic, Google, Ollama, vLLM
- **Any document** - PDF, DOCX, TXT, Markdown, URLs
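
As a rough illustration of the "graded and refined until passing" idea, here is a generic grade-and-refine loop. It is a sketch under stated assumptions and not Synkro's actual implementation: `refine_until_passing`, the `grade` and `refine` callables, and the `max_rounds` cap are all hypothetical names introduced for this example.

```python
from typing import Callable


def refine_until_passing(
    response: str,
    grade: Callable[[str], bool],
    refine: Callable[[str], str],
    max_rounds: int = 3,
) -> tuple[str, bool]:
    """Grade a response and refine it until it passes or max_rounds is used up.

    grade and refine are hypothetical stand-ins for calls to a grading model
    and a generation model; max_rounds is an assumed safety cap.
    """
    for _ in range(max_rounds):
        if grade(response):
            return response, True
        response = refine(response)
    # Grade once more so the last refinement is not returned unchecked.
    return response, grade(response)
```

The cap on rounds keeps the loop from running indefinitely when a response never satisfies the grader. The returned flag lets the caller decide whether to keep or drop a response that still fails.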

## Documentation

Full documentation is available at **[synkro.sh/docs](https://synkro.sh/docs)**.

- [Quickstart](https://synkro.sh/docs/quickstart)
- [Dataset Types](https://synkro.sh/docs/datasets/conversation)
- [Coverage Tracking](https://synkro.sh/docs/concepts/coverage)
- [Tool Calling](https://synkro.sh/docs/guides/tool-calling)
- [API Reference](https://synkro.sh/docs/api-reference/overview)
