Metadata-Version: 2.4
Name: smirk
Version: 0.2.0
Classifier: Programming Language :: Rust
Classifier: Programming Language :: Python :: Implementation :: CPython
Classifier: Programming Language :: Python :: Implementation :: PyPy
Requires-Dist: transformers >=4.40, <5
License-File: LICENSE
Summary: A chemically complete tokenizer for OpenSMILES
Home-Page: https://eeg.engin.umich.edu/smirk
Author: Anoushka Bhutani
Author-email: Alexius Wadell <awadell@umich.edu>
License: Apache-2.0
Requires-Python: >=3.9
Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
Project-URL: source, https://github.com/BattModels/smirk
Project-URL: homepage, https://eeg.engin.umich.edu/smirk
Project-URL: documentation, https://eeg.engin.umich.edu/smirk
Project-URL: issues, https://github.com/BattModels/smirk/issues
Project-URL: changelog, https://github.com/BattModels/smirk/blob/main/CHANGELOG.md

# Smirk: A Tokenizer for OpenSMILES

<div align="center" display="flex" >

![GitHub License](https://img.shields.io/github/license/BattModels/smirk)
<a href="https://doi.org/10.1021/acs.jcim.5c01856">![paper](https://img.shields.io/badge/paper-10.1021%2Facs.jcim.5c01856-blue)</a>
<a href="https://doi.org/10.5281/zenodo.13761262">![data](https://img.shields.io/badge/data-10.5281%2Fzenodo.13761262-blue)</a>
<a href="https://arxiv.org/abs/2409.15370">![arXiv:2409.15370](https://img.shields.io/badge/cs.LG-2409.15370-b31b1b?style=flat&amp;logo=arxiv&amp;logoColor=red)</a>

</div>

Smirk is a chemistry-specific tokenizer that provides complete coverage of the [OpenSMILES](http://opensmiles.org)
specification. It is built using Rust 🦀 and [HuggingFace's tokenizers](https://huggingface.co/docs/tokenizers) 🤗.
It installs with a single `pip` command and works out of the box with the [HuggingFace](https://huggingface.co/docs) ecosystem.

Check out our [documentation](https://eeg.engin.umich.edu/smirk) to see `smirk` in action, or [read the paper](https://doi.org/10.1021/acs.jcim.5c01856) to learn
more about tokenization for molecular foundation models.

## Installation

```
pip install smirk
```
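
To confirm the install, a minimal sketch using only the Python standard library is shown here. It checks the interpreter against the `>=3.9` floor declared in the package metadata and reports the installed versions of `smirk` and its `transformers` dependency. The helper names `installed_version` and `python_is_supported` are hypothetical and are not part of `smirk`.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def python_is_supported(minimum=(3, 9)):
    """Compare the running interpreter against a minimum (major, minor) version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("python ok:", python_is_supported())
    # Distribution names as they appear in the package metadata.
    for name in ("smirk", "transformers"):
        print(name, installed_version(name) or "not installed")
```

If `smirk` is reported as not installed, the `pip install` most likely ran against a different environment than the interpreter running this script.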

