Metadata-Version: 2.4
Name: rustling
Version: 0.6.0
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Information Technology
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Programming Language :: Python :: 3.14
Classifier: Topic :: Scientific/Engineering
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Classifier: Topic :: Scientific/Engineering :: Human Machine Interfaces
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Classifier: Topic :: Text Processing
Classifier: Topic :: Text Processing :: General
Classifier: Topic :: Text Processing :: Indexing
Classifier: Topic :: Text Processing :: Linguistic
License-File: LICENSE.md
Summary: A blazingly fast library for computational linguistics
Keywords: computational-linguistics,linguistics,natural-language-processing,nlp,text-processing,word-segmentation,part-of-speech-tagging,language-models,ngrams,childes,talkbank,chat,averaged-perceptron,hidden-markov-model,longest-string-matching
Author-email: "Jackson L. Lee" <jacksonlunlee@gmail.com>
License-Expression: MIT
Requires-Python: >=3.10
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: Changelog, https://github.com/jacksonllee/rustling/blob/main/CHANGELOG.md
Project-URL: Python Documentation, https://rustling.readthedocs.io
Project-URL: Rust Documentation, https://docs.rs/rustling
Project-URL: Source, https://github.com/jacksonllee/rustling

<img src="https://raw.githubusercontent.com/jacksonllee/rustling/main/python/docs/_static/logo-with-text.svg" alt="Rustling" height="120">

[![PyPI](https://img.shields.io/pypi/v/rustling.svg)](https://pypi.org/project/rustling/)
[![crates.io](https://img.shields.io/crates/v/rustling.svg)](https://crates.io/crates/rustling)

Rustling is a blazingly fast library for computational linguistics.
It is written in Rust, with Python bindings.

Documentation: [Python](https://rustling.readthedocs.io/) | [Rust](https://docs.rs/rustling)

## Features

- N-grams
- Language models
- Hidden Markov models
- Word segmentation
- Part-of-speech tagging
- CHAT parsing for TalkBank and CHILDES data

## Performance

| Component | Task | Speedup | Baseline |
|---|---|---|---|
| **Language Models** | Fit | **10x** | NLTK |
|  | Score | **1.9x** | NLTK |
|  | Generate | **106–114x** | NLTK |
| **Word Segmentation** | LongestStringMatching | **9x** | wordseg |
| **POS Tagging** | Training | **5x** | NLTK |
|  | Tagging | **18x** | NLTK |
| **HMM** | Fit | **13x** | hmmlearn |
|  | Predict | **0.9x** | hmmlearn |
|  | Score | **5x** | hmmlearn |
| **CHAT Parsing** | Reading from a ZIP archive | **43x** | pylangacq |
|  | Reading from strings | **70x** | pylangacq |
|  | Parsing utterances | **15x** | pylangacq |
|  | Parsing tokens | **9x** | pylangacq |

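Each speedup is measured relative to the library named in the last column; a value below 1x (HMM Predict) means Rustling is slower than that library on the task. See [`benchmarks/`](https://github.com/jacksonllee/rustling/tree/main/benchmarks) for reproduction scripts.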


## Installation

### Python

```bash
pip install rustling
```

### Rust

```bash
cargo add rustling
```

## License

MIT License. See `LICENSE.md` for the full text.

