Metadata-Version: 2.4
Name: cocoindex
Version: 1.0.0a28
Classifier: Development Status :: 3 - Alpha
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Programming Language :: Python :: Free Threading :: 2 - Beta
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Indexing
Classifier: Intended Audience :: Developers
Classifier: Natural Language :: English
Classifier: Typing :: Typed
Requires-Dist: typing-extensions>=4.12
Requires-Dist: click>=8.1.8
Requires-Dist: rich>=14.0.0
Requires-Dist: python-dotenv>=1.1.0
Requires-Dist: watchfiles>=1.1.0
Requires-Dist: numpy>=1.23.2
Requires-Dist: psutil>=7.2.1
Requires-Dist: litellm>=1.81.0 ; extra == 'all'
Requires-Dist: sentence-transformers>=3.3.1 ; extra == 'all'
Requires-Dist: colpali-engine ; extra == 'all'
Requires-Dist: lancedb>=0.25.0 ; extra == 'all'
Requires-Dist: pyarrow>=19.0.0 ; extra == 'all'
Requires-Dist: asyncpg>=0.31.0 ; extra == 'all'
Requires-Dist: pgvector>=0.4.2 ; extra == 'all'
Requires-Dist: qdrant-client>=1.6.0 ; extra == 'all'
Requires-Dist: sqlite-vec>=0.1.6 ; extra == 'all'
Requires-Dist: surrealdb>=1.0.0 ; extra == 'all'
Requires-Dist: google-api-python-client>=2.0.0 ; extra == 'all'
Requires-Dist: google-auth>=2.0.0 ; extra == 'all'
Requires-Dist: aiobotocore>=2.0.0 ; extra == 'all'
Requires-Dist: aiohttp>=3.9.0 ; extra == 'all'
Requires-Dist: pymysql>=1.1.0 ; extra == 'all'
Requires-Dist: aiomysql>=0.2.0 ; extra == 'all'
Requires-Dist: aiobotocore>=2.0.0 ; extra == 'amazon-s3'
Requires-Dist: colpali-engine ; extra == 'colpali'
Requires-Dist: aiohttp>=3.9.0 ; extra == 'doris'
Requires-Dist: pymysql>=1.1.0 ; extra == 'doris'
Requires-Dist: aiomysql>=0.2.0 ; extra == 'doris'
Requires-Dist: google-api-python-client>=2.0.0 ; extra == 'google-drive'
Requires-Dist: google-auth>=2.0.0 ; extra == 'google-drive'
Requires-Dist: lancedb>=0.25.0 ; extra == 'lancedb'
Requires-Dist: pyarrow>=19.0.0 ; extra == 'lancedb'
Requires-Dist: litellm>=1.81.0 ; extra == 'litellm'
Requires-Dist: asyncpg>=0.31.0 ; extra == 'postgres'
Requires-Dist: pgvector>=0.4.2 ; extra == 'postgres'
Requires-Dist: qdrant-client>=1.6.0 ; extra == 'qdrant'
Requires-Dist: sentence-transformers>=3.3.1 ; extra == 'sentence-transformers'
Requires-Dist: sqlite-vec>=0.1.6 ; extra == 'sqlite'
Requires-Dist: surrealdb>=1.0.0 ; extra == 'surrealdb'
Provides-Extra: all
Provides-Extra: amazon_s3
Provides-Extra: colpali
Provides-Extra: doris
Provides-Extra: google_drive
Provides-Extra: lancedb
Provides-Extra: litellm
Provides-Extra: postgres
Provides-Extra: qdrant
Provides-Extra: sentence_transformers
Provides-Extra: sqlite
Provides-Extra: surrealdb
License-File: THIRD_PARTY_NOTICES.html
Summary: With CocoIndex, users declare the transformation, CocoIndex creates & maintains an index, and keeps the derived index up to date based on source update, with minimal computation and changes.
Keywords: indexing,real-time,incremental,pipeline,search,ai,etl,rag,dataflow,context-engineering
Author-email: CocoIndex <cocoindex.io@gmail.com>
License-Expression: Apache-2.0
Requires-Python: >=3.11
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Homepage, https://cocoindex.io/

<p align="center">
    <img src="https://cocoindex.io/images/github.svg" alt="CocoIndex">
</p>

<h1 align="center">Data transformation for AI</h1>

<div align="center">

[![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex)
[![Documentation](https://img.shields.io/badge/Documentation-394e79?logo=readthedocs&logoColor=00B9FF)](https://cocoindex.io/docs/getting_started/quickstart)
[![License](https://img.shields.io/badge/license-Apache%202.0-5B5BD6?logoColor=white)](https://opensource.org/licenses/Apache-2.0)
[![PyPI version](https://img.shields.io/pypi/v/cocoindex?color=5B5BD6)](https://pypi.org/project/cocoindex/)
<!--[![PyPI - Downloads](https://img.shields.io/pypi/dm/cocoindex)](https://pypistats.org/packages/cocoindex) -->
[![PyPI Downloads](https://static.pepy.tech/badge/cocoindex/month)](https://pepy.tech/projects/cocoindex)
[![CI](https://github.com/cocoindex-io/cocoindex/actions/workflows/CI.yml/badge.svg?event=push&color=5B5BD6)](https://github.com/cocoindex-io/cocoindex/actions/workflows/CI.yml)
[![release](https://github.com/cocoindex-io/cocoindex/actions/workflows/release.yml/badge.svg?event=push&color=5B5BD6)](https://github.com/cocoindex-io/cocoindex/actions/workflows/release.yml)
[![Link Check](https://github.com/cocoindex-io/cocoindex/actions/workflows/links.yml/badge.svg)](https://github.com/cocoindex-io/cocoindex/actions/workflows/links.yml)
[![prek](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/j178/prek/master/docs/assets/badge-v0.json)](https://github.com/j178/prek)
[![Discord](https://img.shields.io/discord/1314801574169673738?logo=discord&color=5B5BD6&logoColor=white)](https://discord.com/invite/zpA9S2DR7s)

</div>

<div align="center">
    <a href="https://trendshift.io/repositories/13939" target="_blank"><img src="https://trendshift.io/api/badge/repositories/13939" alt="cocoindex-io%2Fcocoindex | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
</div>

An ultra-performant data transformation framework for AI, with a core engine written in Rust. It supports incremental processing and data lineage out of the box, offers exceptional developer velocity, and is production-ready from day 0.

⭐ Drop a star to help us grow!

<div align="center">

<!-- Keep these links. Translations will automatically update with the README. -->
[Deutsch](https://readme-i18n.com/cocoindex-io/cocoindex?lang=de) |
[English](https://readme-i18n.com/cocoindex-io/cocoindex?lang=en) |
[Español](https://readme-i18n.com/cocoindex-io/cocoindex?lang=es) |
[français](https://readme-i18n.com/cocoindex-io/cocoindex?lang=fr) |
[日本語](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ja) |
[한국어](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ko) |
[Português](https://readme-i18n.com/cocoindex-io/cocoindex?lang=pt) |
[Русский](https://readme-i18n.com/cocoindex-io/cocoindex?lang=ru) |
[中文](https://readme-i18n.com/cocoindex-io/cocoindex?lang=zh)

</div>

<br>

<p align="center">
    <img src="https://cocoindex.io/images/transformation.svg" alt="CocoIndex Transformation">
</p>

<br>

CocoIndex makes it effortless to transform data with AI and keep source data and target in sync. Whether you’re building a vector index, creating knowledge graphs for context engineering, or performing any custom data transformation, CocoIndex goes beyond SQL.

<br>

<p align="center">
<img alt="CocoIndex Features" src="https://cocoindex.io/images/venn2.svg" />
</p>

<br>

## Exceptional velocity

Just declare the transformation as a dataflow in ~100 lines of Python:

```python
# import
data['content'] = flow_builder.add_source(...)

# transform
data['out'] = (
    data['content']
    .transform(...)
    .transform(...)
)

# collect data
collector.collect(...)

# export to db, vector db, graph db ...
collector.export(...)
```

CocoIndex follows the [Dataflow](https://en.wikipedia.org/wiki/Dataflow_programming) programming model. Each transformation creates a new field based solely on input fields, with no hidden state or value mutation. All data before and after each transformation is observable, with lineage out of the box.

**In particular**, developers don't explicitly mutate data by creating, updating, or deleting it. They only need to define the transformation (formula) for a set of source data.
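
To make the dataflow idea concrete, here is a minimal sketch in plain Python. It does not use the CocoIndex API; the `Row` alias and the `derive` helper are hypothetical names chosen for illustration. Each step returns a new read-only row with one extra field computed only from existing fields, so every intermediate value stays observable.

```python
from types import MappingProxyType
from typing import Any, Callable, Mapping

# Hypothetical alias for illustration: a row is a read-only mapping of fields.
Row = Mapping[str, Any]


def derive(row: Row, name: str, fn: Callable[[Row], Any]) -> Row:
    """Return a new read-only row with `name` computed from `row`; `row` is unchanged."""
    if name in row:
        raise KeyError(f"field {name!r} already exists; fields are never overwritten")
    return MappingProxyType({**row, name: fn(row)})


source: Row = MappingProxyType({"content": "Hello dataflow world"})
step1 = derive(source, "words", lambda r: r["content"].split())
step2 = derive(step1, "word_count", lambda r: len(r["words"]))

# Every intermediate row is still available for inspection.
print(dict(source))
print(dict(step1))
print(dict(step2))
```

Because no step mutates its input, the chain `source -> step1 -> step2` doubles as a simple record of where each field came from.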

## Plug-and-Play Building Blocks

Native built-ins for different sources, targets, and transformations. A standardized interface makes switching between components a one-line code change, as easy as assembling building blocks.

<p align="center">
    <img src="https://cocoindex.io/images/components.svg" alt="CocoIndex Features">
</p>
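
As a rough illustration of why a standardized interface turns a component swap into a one-line change, the sketch below uses plain Python. The `Target` protocol, `ListTarget`, `KeyedTarget`, and `run_pipeline` are all hypothetical names invented for this example and are not CocoIndex classes.

```python
from typing import Protocol


class Target(Protocol):
    """Hypothetical common interface; not a CocoIndex class."""

    def export(self, rows: list[dict[str, str]]) -> int: ...


class ListTarget:
    """Stores exported rows in arrival order."""

    def __init__(self) -> None:
        self.rows: list[dict[str, str]] = []

    def export(self, rows: list[dict[str, str]]) -> int:
        self.rows.extend(rows)
        return len(rows)


class KeyedTarget:
    """Stores exported rows by a key field, overwriting earlier rows with the same key."""

    def __init__(self, key_field: str) -> None:
        self.key_field = key_field
        self.rows: dict[str, dict[str, str]] = {}

    def export(self, rows: list[dict[str, str]]) -> int:
        for row in rows:
            self.rows[row[self.key_field]] = row
        return len(rows)


def run_pipeline(target: Target) -> int:
    """The pipeline depends only on the shared interface, not on a concrete target."""
    rows = [{"id": "1", "text": "hello"}, {"id": "2", "text": "world"}]
    return target.export(rows)


target = ListTarget()  # swap for KeyedTarget("id") without touching run_pipeline
exported = run_pipeline(target)
print(exported)
```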

## Data Freshness

CocoIndex keeps source data and target in sync effortlessly.

<p align="center">
    <img src="https://github.com/user-attachments/assets/f4eb29b3-84ee-4fa0-a1e2-80eedeeabde6" alt="Incremental Processing" width="700">
</p>

It has out-of-the-box support for incremental indexing:

- Minimal recomputation when the source or logic changes.
- (Re-)processing of only the necessary portions, reusing the cache when possible.
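
The two points above can be sketched with a content fingerprint and a cache. This is a simplified illustration in plain Python, not how CocoIndex is implemented; the `IncrementalIndex` class, its `logic_version` parameter, and the `recomputed` list are hypothetical names for this example. An entry is recomputed only when its source content or the logic version changes.

```python
import hashlib
from typing import Callable, Dict, List, Tuple


class IncrementalIndex:
    """Hypothetical illustration: recompute only entries whose content or logic changed."""

    def __init__(self, transform: Callable[[str], str], logic_version: str) -> None:
        self.transform = transform
        self.logic_version = logic_version
        self._cache: Dict[str, Tuple[str, str]] = {}  # key -> (fingerprint, output)
        self.recomputed: List[str] = []  # keys recomputed by the latest update()

    def _fingerprint(self, content: str) -> str:
        # The fingerprint covers both the logic version and the source content.
        data = f"{self.logic_version}\0{content}".encode("utf-8")
        return hashlib.sha256(data).hexdigest()

    def update(self, sources: Dict[str, str]) -> Dict[str, str]:
        self.recomputed = []
        # Drop outputs whose source no longer exists.
        for key in list(self._cache):
            if key not in sources:
                del self._cache[key]
        # Recompute only new or changed entries; reuse the cache otherwise.
        for key, content in sources.items():
            fingerprint = self._fingerprint(content)
            cached = self._cache.get(key)
            if cached is None or cached[0] != fingerprint:
                self._cache[key] = (fingerprint, self.transform(content))
                self.recomputed.append(key)
        return {key: output for key, (_, output) in self._cache.items()}


index = IncrementalIndex(str.upper, logic_version="v1")
first = index.update({"a.md": "alpha", "b.md": "beta"})
first_recomputed = list(index.recomputed)  # both entries are new
second = index.update({"a.md": "alpha", "b.md": "beta2"})
print(first_recomputed, index.recomputed, second)
```

In the second call only `b.md` is reprocessed, because the fingerprint of `a.md` is unchanged.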

## Quick Start

If you're new to CocoIndex, we recommend checking out:

- 📖 [Documentation](https://cocoindex.io/docs)
- ⚡  [Quick Start Guide](https://cocoindex.io/docs/getting_started/quickstart)
- 🎬 [Quick Start Video Tutorial](https://youtu.be/gv5R8nOXsWU?si=9ioeKYkMEnYevTXT)

### Setup

1. Install the CocoIndex Python library:

> **Note**: CocoIndex v1 is currently in preview (pre-release). Use the `--pre` flag with pip, or configure your package manager to allow pre-releases.

```sh
pip install -U --pre cocoindex
```

2. [Install Postgres](https://cocoindex.io/docs/getting_started/installation#-install-postgres) if you don't have one. CocoIndex uses it for incremental processing.

3. (Optional) Install the Claude Code skill for an enhanced development experience. Run these commands in [Claude Code](https://claude.com/claude-code):

```
/plugin marketplace add cocoindex-io/cocoindex-claude
/plugin install cocoindex-skills@cocoindex
```

## 📖 Documentation

For detailed documentation, visit [CocoIndex Documentation](https://cocoindex.io/docs), including a [Quickstart guide](https://cocoindex.io/docs/getting_started/quickstart).

## 🤝 Contributing

We love contributions from our community ❤️. For details on contributing or running the project for development, check out our [contributing guide](https://cocoindex.io/docs/about/contributing).

## 👥 Community

Welcome with a huge coconut hug 🥥⋆｡˚🤗. We are super excited for community contributions of all kinds, whether it's code improvements, documentation updates, issue reports, feature requests, or discussions in our Discord.

Join our community here:

- 🌟 [Star us on GitHub](https://github.com/cocoindex-io/cocoindex)
- 👋 [Join our Discord community](https://discord.com/invite/zpA9S2DR7s)
- ▶️ [Subscribe to our YouTube channel](https://www.youtube.com/@cocoindex-io)
- 📜 [Read our blog posts](https://cocoindex.io/blogs/)

## Support us

We are constantly improving, and more features and examples are coming soon. If you love this project, please drop us a star ⭐ on our GitHub repo [![GitHub](https://img.shields.io/github/stars/cocoindex-io/cocoindex?color=5B5BD6)](https://github.com/cocoindex-io/cocoindex) to stay tuned and help us grow.

## License

CocoIndex is Apache 2.0 licensed.

