Metadata-Version: 2.2
Name: hstrat
Version: 1.23.8
Summary: hstrat enables phylogenetic inference on distributed digital evolution populations
Keywords: hstrat
Author-Email: Matthew Andres Moreno <m.more500@gmail.com>
License: MIT license
Classifier: Development Status :: 2 - Pre-Alpha
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Project-URL: homepage, https://github.com/mmore500/hstrat
Project-URL: documentation, https://hstrat.readthedocs.io
Project-URL: repository, https://github.com/mmore500/hstrat
Requires-Python: >=3.10
Requires-Dist: alifedata_phyloinformatics_convert>=0.17.0
Requires-Dist: anytree>=2.8.0
Requires-Dist: astropy>=5.3.4
Requires-Dist: bitarray>=2.6.2
Requires-Dist: bitstring>=3.1.9
Requires-Dist: dendropy>=4.5.2
Requires-Dist: Deprecated>=1.2.13
Requires-Dist: downstream>=1.21.8
Requires-Dist: fishersrc>=0.1.15
Requires-Dist: iterpop>=0.3.4
Requires-Dist: interval_search>=0.3.1
Requires-Dist: joblib>=1.0.0
Requires-Dist: joinem>=0.9.2
Requires-Dist: keyname>=0.6.0
Requires-Dist: lazy_loader>=0.4
Requires-Dist: lru-dict>=1.1.7
Requires-Dist: matplotlib>=3.5.2
Requires-Dist: mmh3>=3.0.0
Requires-Dist: networkx>=2.6.3
Requires-Dist: numpy>=2
Requires-Dist: more-itertools>=8.13.0
Requires-Dist: mpmath>=1.1.0
Requires-Dist: opytional>=0.1.0
Requires-Dist: ordered_set>=4.1.0
Requires-Dist: polars[pyarrow,rt64]!=1.37.*,>=1.10.0
Requires-Dist: packaging>=23.0
Requires-Dist: phyloframe>=0.1.1
Requires-Dist: pandas<3,>=2
Requires-Dist: pandera>=0.21.0
Requires-Dist: prettytable>=3.5.0
Requires-Dist: pyarrow>=16.0.0
Requires-Dist: python-slugify>=6.1.2
Requires-Dist: safe_assert>=0.2.0
Requires-Dist: scikit-learn>=0.17
Requires-Dist: seaborn>=0.11.0
Requires-Dist: sortedcontainers>=2.4.0
Requires-Dist: statsmodels>=0.14.4
Requires-Dist: typing_extensions>=4.7.1
Requires-Dist: tqdm>=4.62.3
Provides-Extra: jit
Requires-Dist: cppimport>=22.8.2; extra == "jit"
Requires-Dist: numba!=0.62.0,>=0.60.0; extra == "jit"
Requires-Dist: setuptools<81,>=75.0.0; extra == "jit"
Provides-Extra: phylo-extra
Requires-Dist: biopython>=1.79; extra == "phylo-extra"
Requires-Dist: phylotrackpy>=0.2.4; extra == "phylo-extra"
Provides-Extra: testing
Requires-Dist: black==22.10.0; extra == "testing"
Requires-Dist: biopython>=1.79; extra == "testing"
Requires-Dist: colorclade>=0.3.0; extra == "testing"
Requires-Dist: contexttimer==0.3.3; extra == "testing"
Requires-Dist: coverage>=7.6.1; extra == "testing"
Requires-Dist: flake8==3.7.8; extra == "testing"
Requires-Dist: isort>=5.12.0; extra == "testing"
Requires-Dist: iterify==0.1.0; extra == "testing"
Requires-Dist: more-itertools>=8.13.0; extra == "testing"
Requires-Dist: mypy==0.991; extra == "testing"
Requires-Dist: phylotrackpy==0.2.4; extra == "testing"
Requires-Dist: pytest-xdist==2.5.0; extra == "testing"
Requires-Dist: pytest==8.3.3; extra == "testing"
Requires-Dist: pyyaml==6.0.1; extra == "testing"
Requires-Dist: ruff==0.1.11; extra == "testing"
Requires-Dist: safe-assert==0.2.0; extra == "testing"
Requires-Dist: scipy==1.14.1; extra == "testing"
Requires-Dist: setuptools<81,>=75.0.0; extra == "testing"
Requires-Dist: teeplot>=1.2.0; extra == "testing"
Requires-Dist: tox==3.24.0; extra == "testing"
Requires-Dist: tqdist==1.0; extra == "testing"
Provides-Extra: release
Requires-Dist: bumpver==2022.1120; extra == "release"
Requires-Dist: pybind11==2.13.6; extra == "release"
Requires-Dist: setuptools==75.5.0; extra == "release"
Requires-Dist: setuptools-scm==8.1.0; extra == "release"
Requires-Dist: twine==1.14.0; extra == "release"
Requires-Dist: uv==0.4.18; extra == "release"
Requires-Dist: wheel==0.45.0; extra == "release"
Provides-Extra: docs
Requires-Dist: ipython==8.12.3; extra == "docs"
Requires-Dist: ipykernel==6.26.0; extra == "docs"
Requires-Dist: nbsphinx==0.9.3; extra == "docs"
Requires-Dist: pybind11==2.13.6; extra == "docs"
Requires-Dist: pyyaml==6.0.1; extra == "docs"
Requires-Dist: Sphinx==8.1.3; extra == "docs"
Requires-Dist: sphinx_rtd_theme==3.0.2; extra == "docs"
Description-Content-Type: text/markdown

![hstrat wordmark](docs/assets/hstrat-wordmark.png)

[
![PyPi](https://img.shields.io/pypi/v/hstrat.svg)
](https://pypi.python.org/pypi/hstrat)
[
![codecov](https://codecov.io/gh/mmore500/hstrat/branch/master/graph/badge.svg?token=JwMfFOpBBD)
](https://codecov.io/gh/mmore500/hstrat)
[
![Codacy Badge](https://app.codacy.com/project/badge/Grade/9ab14d415aa9458d97b4cf760b95f874)
](https://www.codacy.com/gh/mmore500/hstrat/dashboard)
[
![CI](https://github.com/mmore500/hstrat/actions/workflows/ci.yaml/badge.svg)
](https://github.com/mmore500/hstrat/actions)
[
![Read The Docs](https://readthedocs.org/projects/hstrat/badge/?version=latest)
](https://hstrat.readthedocs.io/en/latest/?badge=latest)
[
![GitHub stars](https://img.shields.io/github/stars/mmore500/hstrat.svg?style=round-square&logo=github&label=Stars&logoColor=white)](https://github.com/mmore500/hstrat)
[
![Zenodo](https://zenodo.org/badge/464531144.svg)
](https://zenodo.org/badge/latestdoi/464531144)
[![JOSS](https://joss.theoj.org/papers/10.21105/joss.04866/status.svg)](https://doi.org/10.21105/joss.04866)

_hstrat_ enables phylogenetic inference on distributed digital evolution populations

- Free software: MIT license
- Documentation: <https://hstrat.readthedocs.io>
- Repository: <https://github.com/mmore500/hstrat>

## Install

`python3 -m pip install hstrat`

A containerized release of `hstrat` is available via [ghcr.io](https://ghcr.io/mmore500/hstrat)

```bash
singularity exec docker://ghcr.io/mmore500/hstrat:v1.23.8 python3 -m hstrat --help
```

## Features

_hstrat_ serves to enable **robust, efficient extraction of evolutionary history** from evolutionary simulations where centralized, direct phylogenetic tracking is not feasible.
Namely, in large-scale, **decentralized parallel/distributed evolutionary simulations**, where agents' evolutionary lineages migrate among many cooperating processors over the course of simulation.

_hstrat_ can

- accurately estimate **time since MRCA** among two or several digital agents, even for uneven branch lengths
- **reconstruct phylogenetic trees** for entire populations of evolving digital agents
- **serialize genome annotations** to/from text and binary formats
- provide **low-footprint** genome annotations (e.g., reasonably as low as **64 bits** each)
- be directly configured to satisfy **memory use limits** and/or **inference accuracy requirements**

_hstrat operates just as well in single-processor simulation, but direct phylogenetic tracking using a tool like [phylotrackpy](https://github.com/emilydolson/phylotrackpy/) should usually be preferred in such cases due to its capability for perfect record-keeping given centralized global simulation observability._

## Example Usage

This code briefly demonstrates,

1.  initialization of a population of `HereditaryStratigraphicColumn` of objects,
2.  generation-to-generation transmission of `HereditaryStratigraphicColumn` objects with simple synchronous turnover, and then
3.  reconstruction of phylogenetic history from the final population of `HereditaryStratigraphicColumn` objects.

```python3
from random import choice as rchoice
import alifedata_phyloinformatics_convert as apc
from hstrat import hstrat; print(f"{hstrat.__version__=}")  # when last ran?
from hstrat._auxiliary_lib import seed_random; seed_random(1)  # reproducibility

# initialize a small population of hstrat instrumentation
# (in full simulations, each column would be attached to an individual genome)
population = [hstrat.HereditaryStratigraphicColumn() for __ in range(5)]

# evolve population for 40 generations under drift
for _generation in range(40):
    population = [rchoice(population).CloneDescendant() for __ in population]

# reconstruct estimate of phylogenetic history
alifestd_df = hstrat.build_tree(population, version_pin=hstrat.__version__)
tree_ascii = apc.RosettaTree(alifestd_df).as_dendropy.as_ascii_plot(width=20)
print(tree_ascii)
```

```
hstrat.__version__='1.8.8'
              /--- 1
          /---+
       /--+   \--- 3
       |  |
   /---+  \------- 2
   |   |
+--+   \---------- 0
   |
   \-------------- 4
```

In [actual usage](https://hstrat.readthedocs.io/en/latest/demo-ping.html), each _hstrat_ column would be bundled with underlying genetic material of interest in the simulation --- entire genomes or, in systems with sexual recombination, individual genes.
The _hstrat_ columns are designed to operate as a neutral genetic annotation, enhancing observability of the simulation but not affecting its outcome.

## How it Works

In order to enable phylogenetic inference over fully-distributed evolutionary simulation, hereditary stratigraphy adopts a paradigm akin to phylogenetic work in natural history/biology.
In these fields, phylogenetic history is inferred through comparisons among genetic material of extant organisms, with --- in broad terms --- phylogenetic relatedness established through the extent of genetic similarity between organisms.
Phylogenetic tracking through _hstrat_, similarly, is achieved through analysis of similarity/dissimilarity among genetic material sampled over populations of interest.

Rather than random mutation as with natural genetic material, however, genetic material used by _hstrat_ is structured through _hereditary stratigraphy_.
This methodology, described fully in our documentation, provides strong guarantees on phylogenetic inferential power, minimizes memory footprint, and allows efficient reconstruction procedures.

See [here](https://hstrat.readthedocs.io/en/latest/mechanism.html) for more detail on underlying hereditary stratigraphy methodology.

## Getting Started

Refer to our documentation for a [quickstart guide](https://hstrat.readthedocs.io/en/latest/quickstart.html) and an [annotated end-to-end usage example](https://hstrat.readthedocs.io/en/latest/demo-ping.html).

The `examples/` folder provides extensive usage examples, including

- incorporation of hstrat annotations into a custom genome class,
- automatic stratum retention policy parameterization,
- pairwise and population-level phylogenetic inference, and
- phylogenetic tree reconstruction.

Interested users can find an explanation of how hereditary stratigraphy methodology implemented by _hstrat_ works "under the hood," information on project-specific _hstrat_ configuration, and full API listing for the _hstrat_ package in [the documentation](https://hstrat.readthedocs.io/).

## Citing

If _hstrat_ software or hereditary stratigraphy methodology contributes to a scholarly work, please cite it according to references provided [here](https://hstrat.readthedocs.io/en/latest/citing.html).
We would love to list your project using _hstrat_ in our documentation, see more [here](https://hstrat.readthedocs.io/en/latest/projects.html).

## Credits

This package was created with Cookiecutter and the `audreyr/cookiecutter-pypackage` project template.

## hcat

![hcat](docs/assets/hcat-banner.png)
