Metadata-Version: 2.1
Name: monotonic-alignment-search
Version: 0.1.1
Summary: Monotonically align text and speech
Author-email: Jaehyeon Kim <jaywalnut310@gmail.com>
Maintainer-email: Enno Hermann <enno.hermann@gmail.com>
License: MIT
Project-URL: Repository, https://github.com/eginhard/monotonic_alignment_search
Project-URL: Issues, https://github.com/eginhard/monotonic_alignment_search/issues
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Operating System :: POSIX :: Linux
Classifier: Operating System :: MacOS
Classifier: Operating System :: Microsoft :: Windows
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Multimedia :: Sound/Audio :: Speech
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy >=1.21.6 ; python_version < "3.11"
Requires-Dist: torch >=1.12.1 ; python_version < "3.11"
Requires-Dist: numpy >=1.24.4 ; python_version == "3.11"
Requires-Dist: torch >=2.1.1 ; python_version == "3.11"
Requires-Dist: numpy >=1.26.4 ; python_version == "3.12"
Requires-Dist: torch >=2.2 ; python_version == "3.12"
Requires-Dist: numpy >=2.1.3 ; python_version == "3.13"
Requires-Dist: torch >=2.5.1 ; python_version == "3.13"

<!--
SPDX-FileCopyrightText: Enno Hermann

SPDX-License-Identifier: MIT
-->

# Monotonic Alignment Search (MAS)

[![PyPI - License](https://img.shields.io/pypi/l/monotonic-alignment-search)](https://github.com/eginhard/monotonic_alignment_search/blob/main/LICENSE)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/monotonic-alignment-search)
[![PyPI - Version](https://img.shields.io/pypi/v/monotonic-alignment-search)](https://pypi.org/project/monotonic-alignment-search)
![GithubActions](https://github.com/eginhard/monotonic_alignment_search/actions/workflows/test.yml/badge.svg)
![GithubActions](https://github.com/eginhard/monotonic_alignment_search/actions/workflows/lint.yml/badge.svg)

Implementation of MAS from [Glow-TTS](https://github.com/jaywalnut310/glow-tts)
for easy reuse in other projects.

## Installation

```bash
pip install monotonic-alignment-search
```

Wheels are provided for Linux, macOS, and Windows.

## Usage

MAS finds the most probable monotonic alignment between a text sequence of
length `t_x` and a speech sequence of length `t_y`.

```python
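import torch
from monotonic_alignment_search import maximum_path

# value (torch.Tensor): [batch_size, t_x, t_y], score of each text/speech frame pair
# mask  (torch.Tensor): [batch_size, t_x, t_y], nonzero at valid (unpadded) positions
value = torch.randn(2, 5, 10)  # random placeholder scores
mask = torch.ones(2, 5, 10)  # no padding in this example
path = maximum_path(value, mask, implementation="cython")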
```

The `implementation` argument allows choosing from one of the following
implementations:

- `cython` (default): Cython-optimised
- `numpy`: pure NumPy
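
For intuition about what the search computes, the dynamic program behind MAS
can be sketched in a few lines of pure Python. This is only an illustration for
a single, unpadded example, not the package's Cython or NumPy code: the name
`mas_sketch` and the list-of-lists input are assumptions made for this sketch,
and batching and masking are left out.

```python
def mas_sketch(value):
    """Illustrative MAS for one unpadded score matrix value[t_x][t_y].

    Returns a 0/1 matrix of the same shape in which every speech frame
    (column) is assigned to exactly one text token (row), and the row
    index never decreases and never skips a token.
    """
    t_x, t_y = len(value), len(value[0])
    if t_x > t_y:
        raise ValueError("need t_x <= t_y so that every token gets a frame")
    neg_inf = float("-inf")

    # q[i][j]: best cumulative score of a monotonic path ending at (i, j).
    q = [[neg_inf] * t_y for _ in range(t_x)]
    q[0][0] = value[0][0]
    for j in range(1, t_y):
        for i in range(min(j + 1, t_x)):
            stay = q[i][j - 1]  # same token as the previous frame
            advance = q[i - 1][j - 1] if i > 0 else neg_inf  # next token
            q[i][j] = max(stay, advance) + value[i][j]

    # Backtrack from the last token at the last frame.
    path = [[0] * t_y for _ in range(t_x)]
    i = t_x - 1
    for j in range(t_y - 1, -1, -1):
        path[i][j] = 1
        # Step back to the previous token if we must (i == j) or if
        # advancing scored strictly better than staying.
        if i > 0 and (i == j or q[i - 1][j - 1] > q[i][j - 1]):
            i -= 1
    return path


scores = [
    [0.0, -0.5, -5.0, -5.0],
    [-5.0, -1.0, 0.0, -0.5],
]
print(mas_sketch(scores))  # [[1, 1, 0, 0], [0, 0, 1, 1]]
```

Here the first two frames are assigned to the first token and the last two to
the second. Ties are resolved in favour of staying on the current token, which
is a choice made for this sketch.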

## References

This implementation is taken from the original [Glow-TTS
repository](https://github.com/jaywalnut310/glow-tts). Consider citing the
Glow-TTS paper when using this project:

```bibtex
@inproceedings{kim2020_glowtts,
    title={Glow-{TTS}: A Generative Flow for Text-to-Speech via Monotonic Alignment Search},
    author={Jaehyeon Kim and Sungwon Kim and Jungil Kong and Sungroh Yoon},
    booktitle={Proceedings of Neur{IPS}},
    year={2020},
}
```
