Metadata-Version: 2.4
Name: tessbind
Version: 1.1.0
Summary: Tesseract pybind11 bindings
Author-email: Enno Richter <enno@nerdworks.de>
License-Expression: Apache-2.0
Project-URL: Documentation, https://tessbind.readthedocs.io/
Project-URL: Bug Tracker, https://github.com/elohmeier/tessbind/issues
Project-URL: Discussions, https://github.com/elohmeier/tessbind/discussions
Project-URL: Changelog, https://github.com/elohmeier/tessbind/releases
Classifier: Development Status :: 1 - Planning
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering
Classifier: Typing :: Typed
Requires-Python: >=3.12
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: typing-extensions>=4.6; python_version < "3.11"
Provides-Extra: dev
Requires-Dist: pytest>=6; extra == "dev"
Requires-Dist: pytest-cov>=3; extra == "dev"
Provides-Extra: docs
Requires-Dist: furo>=2023.08.17; extra == "docs"
Requires-Dist: myst-parser>=0.13; extra == "docs"
Requires-Dist: sphinx>=7.0; extra == "docs"
Requires-Dist: sphinx-autodoc-typehints; extra == "docs"
Requires-Dist: sphinx-copybutton; extra == "docs"
Provides-Extra: test
Requires-Dist: pytest>=6; extra == "test"
Requires-Dist: pytest-cov>=3; extra == "test"
Dynamic: license-file

# tessbind

[![Actions Status][actions-badge]][actions-link]

[![PyPI version][pypi-version]][pypi-link]
[![PyPI platforms][pypi-platforms]][pypi-link]

Python 3.12+ bindings for Tesseract, built with pybind11. The package vendors its native dependencies (Leptonica, libpng, zlib), so the only thing you need at runtime is Tesseract's trained data files.

## Installation

```bash
pip install tessbind
```

Tesseract language data must be discoverable. If it is not installed in a default location (e.g., `/usr/share/tesseract-ocr/5/tessdata` on Linux or the Homebrew Cellar on macOS), set `TESSDATA_PREFIX` to the directory that contains the `tessdata` folder.
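
If you would rather set the variable from Python than from the shell, a minimal sketch could look like the following. It assumes the variable is read when the engine is initialized, so it has to be set before a `TessbindManager` is created. The helper `find_tessdata_prefix` and the `CANDIDATES` list are hypothetical and not part of tessbind; adjust the paths to your system.

```python
import os
from pathlib import Path


def find_tessdata_prefix(candidates):
    """Return the first candidate directory that contains a `tessdata` folder, else None."""
    for candidate in candidates:
        path = Path(candidate)
        if (path / "tessdata").is_dir():
            return path
    return None


# Hypothetical candidate locations (assumptions, not taken from tessbind).
CANDIDATES = [
    "/usr/share/tesseract-ocr/5",
    "/opt/homebrew/share",
    "/usr/local/share",
]

prefix = find_tessdata_prefix(CANDIDATES)
if prefix is not None:
    # setdefault keeps any value that was already exported in the environment.
    os.environ.setdefault("TESSDATA_PREFIX", str(prefix))
```

Using `setdefault` rather than plain assignment means an explicit `TESSDATA_PREFIX` from the shell still takes precedence.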

## Usage

`TessbindManager` wraps the underlying API in a context manager and exposes the recognized UTF-8 text plus per-word confidences:

```python
from pathlib import Path

from tessbind import PageSegMode, TessbindManager

img_bytes = Path("tests/hello.png").read_bytes()

with TessbindManager(lang="eng", page_seg_mode=PageSegMode.SINGLE_LINE) as tb:
    text, confidences = tb.ocr_image_bytes(img_bytes)

print(text)         # -> Hello, World!
print(confidences)  # list of word-level confidences (0-100)
```
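
The confidence list can be reduced to a few numbers to decide whether a result is trustworthy. The sketch below is plain Python and does not call tessbind. `summarize_confidences` and its `threshold` default of 60 are hypothetical choices for illustration, and the sample values are made up.

```python
from statistics import mean


def summarize_confidences(confidences, threshold=60.0):
    """Summarize word-level confidences (0-100); `threshold` is an arbitrary cutoff."""
    values = [float(c) for c in confidences]
    if not values:
        # No words recognized: report an empty summary instead of raising.
        return {"words": 0, "mean": None, "min": None, "below_threshold": 0}
    return {
        "words": len(values),
        "mean": mean(values),
        "min": min(values),
        "below_threshold": sum(1 for v in values if v < threshold),
    }


# Made-up sample values standing in for the `confidences` returned above.
summary = summarize_confidences([96, 91, 42])
print(summary["below_threshold"])  # -> 1
```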

Use the `page_seg_mode` setter to change the segmentation mode between calls. If you omit the `page_seg_mode` argument when constructing the manager, Tesseract's default mode is used.
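
One way to use the setter is to try several segmentation modes on the same image and keep the result with the highest mean confidence. This is a sketch rather than part of tessbind: `ocr_with_modes` is a hypothetical helper, and it assumes `page_seg_mode` can be assigned as a property on an open manager and that `ocr_image_bytes` returns `(text, confidences)` as in the example above.

```python
def ocr_with_modes(tb, img_bytes, modes):
    """Run OCR once per mode and return (mode, text, confidences) for the best mean confidence."""
    best = None
    for mode in modes:
        # Assumption: the setter is exposed as an assignable property.
        tb.page_seg_mode = mode
        text, confidences = tb.ocr_image_bytes(img_bytes)
        # An empty confidence list (no words found) scores 0.
        score = sum(confidences) / len(confidences) if confidences else 0.0
        if best is None or score > best[0]:
            best = (score, mode, text, confidences)
    if best is None:
        raise ValueError("modes must not be empty")
    return best[1:]
```

Inside a `with TessbindManager(lang="eng") as tb:` block, you would call `ocr_with_modes(tb, img_bytes, modes)` with `modes` set to a list such as `[PageSegMode.SINGLE_LINE]` plus whichever other `PageSegMode` members you want to compare.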

## Development

- `uv sync --extra test` to create the venv and build vendored libraries.
- `uv run pytest -m "not slow"` to run the test suite.

<!-- SPHINX-START -->

<!-- prettier-ignore-start -->
[actions-badge]:            https://github.com/elohmeier/tessbind/workflows/CI/badge.svg
[actions-link]:             https://github.com/elohmeier/tessbind/actions
[pypi-link]:                https://pypi.org/project/tessbind/
[pypi-platforms]:           https://img.shields.io/pypi/pyversions/tessbind
[pypi-version]:             https://img.shields.io/pypi/v/tessbind

<!-- prettier-ignore-end -->
