Metadata-Version: 1.1
Name: numpy-indexed
Version: 0.2.8
Summary: Numpy indexing based extensions, such as group_by
Home-page: https://github.com/EelcoHoogendoorn/Numpy_arraysetops_EP
Author: Eelco Hoogendoorn
Author-email: hoogendoorn.eelco@gmail.com
License: Freely Distributable
Description: |Build Status| |Build status|
        
        Numpy indexed operations
        ========================
        
        This package contains functionality for indexed operations on numpy
        ndarrays, providing efficient vectorized functionality such as grouping
        and set operations.
        
        -  Rich and efficient grouping functionality:
        
           -  splitting of values by key-group
           -  reductions of values by key-group
        
        -  Generalization of existing array set operations to nd-arrays, such as:
        
           -  unique
           -  union
           -  difference
           -  exclusive (xor)
           -  contains (in1d)
        
        -  Some new functions:
        
           -  indices: numpy equivalent of list.index
           -  count: numpy equivalent of collections.Counter
           -  mode: find the most frequently occurring items in a set
           -  multiplicity: number of occurrences of each key in a sequence
           -  count\_table: like R's table or pandas crosstab, or an ndim version
              of np.bincount
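        
        To make the new functions listed above more concrete, the following
        sketch reproduces their apparent semantics for a flat integer array,
        using only plain NumPy and the standard library. It is an illustration
        based on the one-line descriptions in the list, not this package's
        implementation, and the variable names are chosen for this example only.
        
        .. code:: python
        
            from collections import Counter
        
            import numpy as np
        
            keys = np.array([3, 1, 3, 2, 3, 1])
        
            # unique keys, the position of each element among them, and group sizes
            unique_keys, inverse, counts = np.unique(
                keys, return_inverse=True, return_counts=True)
        
            # "count": unique keys with their number of occurrences, like Counter
            as_counter = dict(zip(unique_keys.tolist(), counts.tolist()))
            assert as_counter == Counter(keys.tolist())
        
            # "multiplicity" (as read here): the occurrence count of each element,
            # aligned with the original sequence
            multiplicity = counts[inverse]
        
            # "mode": the most frequently occurring key
            mode = unique_keys[np.argmax(counts)]
        
            # "indices": position of a key in the sequence, like list.index
            position = np.flatnonzero(keys == 2)[0]
            assert position == keys.tolist().index(2)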
        
        The generalization of the existing array set operations pertains
        primarily to the extension of this functionality to different types of
        key objects, such as keys formed by slices of nd-arrays. For instance,
        we may wish to find the intersection of several sets of graph edges.
        
        A few brief examples give an impression of this:
        
        .. code:: python
        
            import numpy as np
            # assumes the package is importable as numpy_indexed
            from numpy_indexed import contains, exclusive, group_by, indices, union
        
            # three sets of graph edges (doublet of ints)
            edges = np.random.randint(0, 9, (3, 100, 2))
            # find graph edges exclusive to one of three sets
            ex = exclusive(*edges)
            print(ex)
            # which edges are exclusive to the first set?
            print(contains(edges[0], ex))
            # where are the exclusive edges relative to the totality of them?
            print(indices(union(*edges), ex))
            # group and reduce values by identical keys
            values = np.random.rand(100, 20)
            print(group_by(edges[0]).median(values))
            # and so on...
        
        Installation
        ------------
        
        conda install numpy-indexed -c eelcohoogendoorn
        
        Design decisions:
        -----------------
        
        This package generalizes the design pattern found in numpy.unique:
        once an ndarray has been argsorted, many subsequent operations can be
        implemented efficiently.
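        
        As a rough illustration of this pattern, the sketch below derives the
        unique keys and their counts from a single argsort, using plain NumPy.
        It shows the general idea only and is not taken from this package's
        source; all variable names are assumptions made for the example.
        
        .. code:: python
        
            import numpy as np
        
            keys = np.array([3, 1, 2, 3, 1, 3])
        
            # a single argsort yields the permutation that sorts the keys
            sorter = np.argsort(keys, kind="stable")
            sorted_keys = keys[sorter]
        
            # a group starts wherever a sorted key differs from its predecessor
            is_start = np.concatenate(([True], sorted_keys[1:] != sorted_keys[:-1]))
            start = np.flatnonzero(is_start)
        
            # unique keys and group sizes follow directly from the group starts
            unique_keys = sorted_keys[start]
            counts = np.diff(np.append(start, len(keys)))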
        
        The sorting and related low level operations are encapsulated into a
        hierarchy of Index classes, which allows for efficient lookup of many
        properties for a variety of different key-types. The public API of this
        package is a fairly thin wrapper around these Index objects.
        
        The principal information exposed by an Index object is the set of
        permutations required to map between the original and sorted order of
        the keys.
        This information can subsequently be used for many purposes, such as
        efficiently finding the set of unique keys, or efficiently performing
        group\_by logic on an array of corresponding values.
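        
        The role of these permutations can be sketched in plain NumPy. The
        example below applies the sorting permutation to an array of values to
        compute a per-group sum, and builds the inverse permutation that maps
        sorted order back to the original order. This is a minimal sketch of the
        idea, not the Index classes themselves, and the names are hypothetical.
        
        .. code:: python
        
            import numpy as np
        
            keys = np.array([2, 0, 1, 2, 0, 2])
            values = np.array([10.0, 1.0, 5.0, 20.0, 2.0, 30.0])
        
            # permutation from original order to sorted order
            sorter = np.argsort(keys, kind="stable")
            sorted_keys = keys[sorter]
        
            # first position of each group of identical keys in sorted order
            is_start = np.concatenate(([True], sorted_keys[1:] != sorted_keys[:-1]))
            start = np.flatnonzero(is_start)
        
            # reduce the permuted values over each contiguous group
            group_keys = sorted_keys[start]
            group_sums = np.add.reduceat(values[sorter], start)
        
            # inverse permutation: maps sorted order back to the original order
            inverse = np.empty_like(sorter)
            inverse[sorter] = np.arange(len(sorter))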
        
        The two complex key types currently supported, beyond standard sequences
        of sortable primitive types, are array keys and composite keys. For the
        exact rules by which valid sequences of key objects are cast to index
        objects, see as\_index().
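        
        To give a feel for what an array key involves, the sketch below treats
        each row of a 2-d array as one key and finds the unique rows with a
        lexicographic sort in plain NumPy. It is an assumption-laden
        illustration of the concept, not a description of how as\_index() or the
        Index classes handle such keys.
        
        .. code:: python
        
            import numpy as np
        
            # each row is one key, e.g. a graph edge given as a pair of ints
            edges = np.array([[0, 1], [2, 3], [0, 1], [1, 2], [2, 3], [0, 1]])
        
            # np.lexsort treats its last key as the primary one, so the columns
            # are passed in reverse order to sort by column 0 first
            sorter = np.lexsort(edges.T[::-1])
            sorted_edges = edges[sorter]
        
            # a row starts a new group if any column differs from the previous row
            differs = np.any(sorted_edges[1:] != sorted_edges[:-1], axis=1)
            start = np.flatnonzero(np.concatenate(([True], differs)))
        
            unique_edges = sorted_edges[start]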
        
        Todo and open questions:
        ------------------------
        
        -  What about nesting of key objects? This should be possible too, but
           is not fully supported yet
        -  What about floating point nd keys? Currently, they are treated as
           object indices. However, bitwise and floating point equality are not
           the same thing
        -  Add special index classes for things like object arrays of variable
           length strings?
        -  While this package is aimed more at expanding functionality than
           optimizing performance, the most common code paths might benefit from
           some specialization, such as the concatenation of sorted sets
        -  There may be further generalizations that could be made. merge/join
           functionality perhaps?
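        
        The floating point question in the list above can be made concrete with
        a small plain-NumPy example: positive and negative zero compare equal
        but have different bit patterns, while a NaN never compares equal to
        itself even when the bits are identical. The example only illustrates
        the distinction and makes no claim about how this package compares
        keys.
        
        .. code:: python
        
            import numpy as np
        
            a = np.array([0.0, np.nan])
            b = np.array([-0.0, np.nan])
        
            # floating point equality: 0.0 == -0.0, but nan != nan
            value_equal = a == b
        
            # bitwise equality: reinterpret the same 64 bits as unsigned integers
            bits_equal = a.view(np.uint64) == b.view(np.uint64)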
        
        .. |Build Status| image:: https://travis-ci.org/EelcoHoogendoorn/Numpy_arraysetops_EP.svg?branch=master
           :target: https://travis-ci.org/EelcoHoogendoorn/Numpy_arraysetops_EP
        .. |Build status| image:: https://ci.appveyor.com/api/projects/status/h7w191ovpa9dcfum?svg=true
           :target: https://ci.appveyor.com/project/clinicalgraphics/numpy-arraysetops-ep
        
Keywords: numpy group_by set-operations indexing
Platform: any
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: Intended Audience :: Developers
Classifier: Topic :: Utilities
Classifier: Topic :: Scientific/Engineering
Classifier: License :: Freely Distributable
Classifier: Operating System :: MacOS :: MacOS X
Classifier: Operating System :: Microsoft :: Windows
Classifier: Operating System :: POSIX
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.5
