Metadata-Version: 2.4
Name: pysurprise
Version: 1.2.0
Summary: Python bindings for the Surprise network community structure metric
Author: José Marín
Maintainer: José Marín
License-Expression: GPL-3.0-or-later
Project-URL: Homepage, https://github.com/raldecoa/SurpriseMe
Project-URL: Repository, https://github.com/raldecoa/SurpriseMe
Project-URL: Issues, https://github.com/raldecoa/SurpriseMe/issues
Keywords: networks,community-detection,surprise,graph,clustering,bioinformatics
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Science/Research
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: C++
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Bio-Informatics
Classifier: Topic :: Scientific/Engineering :: Information Analysis
Requires-Python: >=3.8
Description-Content-Type: text/markdown
Provides-Extra: algorithms
Requires-Dist: igraph>=0.10; extra == "algorithms"
Provides-Extra: benchmarks
Requires-Dist: networkx>=2.6; extra == "benchmarks"
Provides-Extra: deepwalk
Requires-Dist: gensim>=4.0; extra == "deepwalk"
Requires-Dist: scikit-learn>=1.0; extra == "deepwalk"
Requires-Dist: networkx>=2.6; extra == "deepwalk"
Requires-Dist: numpy>=1.21; extra == "deepwalk"
Provides-Extra: node2vec
Requires-Dist: node2vec>=0.4; extra == "node2vec"
Requires-Dist: scikit-learn>=1.0; extra == "node2vec"
Requires-Dist: networkx>=2.6; extra == "node2vec"
Requires-Dist: numpy>=1.21; extra == "node2vec"
Provides-Extra: vgae
Requires-Dist: torch>=1.12; extra == "vgae"
Requires-Dist: scipy>=1.7; extra == "vgae"
Requires-Dist: scikit-learn>=1.0; extra == "vgae"
Requires-Dist: networkx>=2.6; extra == "vgae"
Requires-Dist: numpy>=1.21; extra == "vgae"
Provides-Extra: all
Requires-Dist: igraph>=0.10; extra == "all"
Requires-Dist: networkx>=2.6; extra == "all"
Requires-Dist: gensim>=4.0; extra == "all"
Requires-Dist: scikit-learn>=1.0; extra == "all"
Requires-Dist: numpy>=1.21; extra == "all"
Requires-Dist: node2vec>=0.4; extra == "all"
Requires-Dist: torch>=1.12; extra == "all"
Requires-Dist: scipy>=1.7; extra == "all"

# pySurprise

Compute the **Surprise** metric and run community detection algorithms on
complex networks. The core is written in C++ and exposed through pybind11 for speed.

## Installation

```bash
pip install pysurprise                # core + algorithms
pip install "pysurprise[algorithms]"  # optional: igraph for RN (quoted so shells such as zsh do not expand the brackets)
```

## Quick start

```python
import pysurprise

edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
partition = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

pysurprise.surprise(edges, partition)   # → 3.21...
```

Integer labels and list partitions also work:

```python
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
partition = [0, 0, 0, 1, 1, 1]
pysurprise.surprise(edges, partition)
```
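
If downstream code expects a single partition format, the list form can be turned into a mapping. This is a minimal sketch that assumes position `i` of the list holds the community of node `i`, which the example above suggests but does not state. `list_to_dict` is a hypothetical helper, not part of pysurprise.

```python
from typing import Dict, Sequence


def list_to_dict(partition: Sequence[int]) -> Dict[int, int]:
    """Map node index -> community label.

    Assumption: partition[i] is the community of node i.
    """
    return dict(enumerate(partition))


print(list_to_dict([0, 0, 0, 1, 1, 1]))
# {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
```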

## Algorithms

Eight community detection algorithms, all returning `dict[str, int]`:

```python
from pysurprise import algorithms

partition = algorithms.cpm(edges)
results  = algorithms.run_all(edges)   # run all 8 and compare
```

| Algorithm | Function | Reference |
|---|---|---|
| CPM | `cpm()` | Traag *et al.*, Phys Rev E 84 (2011) |
| Infomap | `infomap()` | Rosvall & Bergstrom, PNAS 105 (2008) |
| RB | `rb()` | Reichardt & Bornholdt, Phys Rev E 74 (2006) |
| RN | `rn()` | Ronhovde & Nussinov, Phys Rev E 81 (2010) |
| RNSC | `rnsc()` | King *et al.*, Bioinformatics 20 (2004) |
| SCluster | `scluster()` | Aldecoa & Marín, PLoS ONE 5 (2010) |
| UVCluster | `uvcluster()` | Arnau *et al.*, Bioinformatics 21 (2005) |
| Duch-Arenas | `duch_arenas()` | Duch & Arenas, Phys Rev E 72 (2005) |
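
One way to compare several partitions is to score each one and keep the best. The sketch below assumes the results are available as a mapping from algorithm name to partition; whether `run_all` returns exactly that shape is not stated here. `best_by_score` and the toy score are hypothetical, and the data is a stand-in so the block runs without pysurprise installed.

```python
from typing import Callable, Dict, Hashable, Mapping, Tuple

Partition = Dict[Hashable, int]


def best_by_score(
    results: Mapping[str, Partition],
    score: Callable[[Partition], float],
) -> Tuple[str, Partition, float]:
    """Return (name, partition, score) for the highest-scoring partition."""
    scored = {name: score(part) for name, part in results.items()}
    best = max(scored, key=scored.__getitem__)
    return best, results[best], scored[best]


# Stand-in data (assumed shape: algorithm name -> partition).
edges = [("a", "b"), ("a", "c"), ("b", "c"),
         ("c", "d"), ("d", "e"), ("d", "f"), ("e", "f")]
results = {
    "two_groups": {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1},
    "singletons": {v: i for i, v in enumerate("abcdef")},
}


def intra_edges(part: Partition) -> float:
    """Toy score: number of edges whose endpoints share a community."""
    return float(sum(part[u] == part[v] for u, v in edges))


name, part, value = best_by_score(results, intra_edges)
print(name, value)  # two_groups 6.0
```

With pysurprise installed, the toy score could be swapped for `lambda p: pysurprise.surprise(edges, p)`, provided the partitions have the assumed shape.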

## Benchmarks

Generate synthetic networks with known ground-truth communities:

```python
from pysurprise.benchmarks import lfr_generate, rc_benchmark

# LFR benchmark (power-law degree & community sizes)
edges, gt, sizes = lfr_generate(1000, mu=0.3, seed=42)

# Relaxed Caveman benchmark (open / closed / removal degradation)
edges, gt = rc_benchmark(512, 16, degradation=0.3, mode="open")
```

Both generators support the parameter presets used in Aldecoa & Marín (2011, 2012, 2013).
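
A ground-truth partition is typically used to score what an algorithm recovers. As a rough sketch, normalized mutual information (NMI) can be computed in pure Python. It assumes both partitions are mappings from node to community label over the same nodes; the actual type of `gt` is not documented here, and `nmi` is a hypothetical helper, not part of pysurprise.

```python
from collections import Counter
from math import log
from typing import Dict, Hashable


def nmi(a: Dict[Hashable, int], b: Dict[Hashable, int]) -> float:
    """Normalized mutual information, 2 * I(A;B) / (H(A) + H(B))."""
    if a.keys() != b.keys():
        raise ValueError("partitions must cover the same nodes")
    n = len(a)
    size_a = Counter(a.values())
    size_b = Counter(b.values())
    joint = Counter((a[v], b[v]) for v in a)

    def entropy(sizes: Counter) -> float:
        return -sum(c / n * log(c / n) for c in sizes.values())

    h_a, h_b = entropy(size_a), entropy(size_b)
    if h_a + h_b == 0.0:
        return 1.0  # both partitions put every node in one community
    mutual = sum(c / n * log(c * n / (size_a[x] * size_b[y]))
                 for (x, y), c in joint.items())
    return 2.0 * mutual / (h_a + h_b)


truth = {0: 0, 1: 0, 2: 1, 3: 1}
relabeled = {0: 5, 1: 5, 2: 9, 3: 9}   # same grouping, different labels
crossed = {0: 0, 1: 1, 2: 0, 3: 1}     # statistically independent grouping

print(round(nmi(truth, relabeled), 6))  # 1.0
print(round(nmi(truth, crossed), 6))    # 0.0
```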

## API summary

| Function | Description |
|---|---|
| `surprise(edges, partition)` | Compute Surprise (accepts int or str labels) |
| `compute_surprise(F, M, n, p)` | Low-level: four-parameter Surprise |
| `log_hyper_probability(F, M, n, j)` | Single hypergeometric term (log₁₀) |
| `algorithms.run_all(edges)` | Run all algorithms and return results |
| `benchmarks.lfr_generate(n, mu, ...)` | LFR benchmark generation |
| `benchmarks.rc_benchmark(n, k, D, ...)` | RC benchmark generation |
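
For orientation, the four-parameter form can be sketched from scratch. The reading of the parameters is an assumption taken from Aldecoa & Marín (2011): `F` is the number of possible node pairs, `M` the number of possible intra-community pairs, `n` the number of edges, and `p` the number of intra-community edges. Surprise is then the negative log₁₀ of the hypergeometric tail probability of seeing at least `p` intra-community edges. The helper names below are hypothetical, the sketch handles only undirected simple graphs, and its values need not match the library's output exactly.

```python
from collections import Counter
from math import comb, log10
from typing import Dict, Hashable, Iterable, Tuple

Edge = Tuple[Hashable, Hashable]


def surprise_counts(edges: Iterable[Edge],
                    partition: Dict[Hashable, int]) -> Tuple[int, int, int, int]:
    """Return (F, M, n, p) under the assumed parameter meanings."""
    edges = list(edges)
    k = len(partition)                                  # number of nodes
    F = k * (k - 1) // 2                                # possible node pairs
    sizes = Counter(partition.values())
    M = sum(s * (s - 1) // 2 for s in sizes.values())   # possible intra pairs
    n = len(edges)                                      # actual edges
    p = sum(partition[u] == partition[v] for u, v in edges)
    return F, M, n, p


def surprise_from_counts(F: int, M: int, n: int, p: int) -> float:
    """-log10 of P(X >= p) for X ~ Hypergeometric(F, M, n)."""
    tail = sum(comb(M, j) * comb(F - M, n - j)
               for j in range(p, min(M, n) + 1))
    if tail == 0:
        raise ValueError("p exceeds the largest feasible value")
    # Exact integer arithmetic; logs are taken only at the end.
    return log10(comb(F, n)) - log10(tail)


# Two 4-cliques joined by a single edge.
left, right = [0, 1, 2, 3], [4, 5, 6, 7]
edges = [(u, v) for grp in (left, right)
         for i, u in enumerate(grp) for v in grp[i + 1:]] + [(3, 4)]
partition = {v: 0 for v in left}
partition.update({v: 1 for v in right})

counts = surprise_counts(edges, partition)
print(counts)  # (28, 12, 13, 12)
print(surprise_from_counts(*counts))
```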

## Credits

Developed by José Marín, based on the original
[SurpriseMe](https://github.com/raldecoa/SurpriseMe) software by
Rodrigo Aldecoa and Ignacio Marín. The bundled C++ binaries (Surprise
computation, Jerarca, CPM, RB, RNSC, RN, Infomap) originate from
SurpriseMe. The LFR benchmark binary is from
[Lancichinetti *et al.*](https://sites.google.com/site/andrealancichinetti/files).

## References

> Aldecoa R, Marín I (2014). *SurpriseMe: an integrated tool for network
> community structure characterization using Surprise maximization.*
> Bioinformatics 30(7): 1041–1042.

> Aldecoa R, Marín I (2011). *Deciphering network community structure by
> Surprise.* PLoS ONE 6(9): e24195.

> Aldecoa R, Marín I (2013). *Surprise maximization reveals the community
> structure of complex networks.* Scientific Reports 3: 1060.

> Aldecoa R, Marín I (2012). *Closed benchmarks for network community
> structure characterization.* Physical Review E 85: 026109.

> Lancichinetti A, Fortunato S, Radicchi F (2008). *Benchmark graphs for
> testing community detection algorithms.* Physical Review E 78: 046110.

## License

GPL-3.0-or-later
