**Benchmark setup**

.. code-block:: python

    from pandas_vb_common import *
    
    N = 100000
    ngroups = 100
    
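    def get_test_data(ngroups=100, n=N):
        # Repeat the group labels to fill n entries, then shuffle them.
        unique_groups = list(range(ngroups))
        arr = np.asarray(np.tile(unique_groups, n // ngroups), dtype=object)
    
        # Pad with the leading labels when ngroups does not divide n evenly.
        if len(arr) < n:
            arr = np.asarray(list(arr) + unique_groups[:n - len(arr)],
                             dtype=object)
    
        random.shuffle(arr)
        return arr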
    
    # aggregate multiple columns
    df = DataFrame({'key1' : get_test_data(ngroups=ngroups),
                    'key2' : get_test_data(ngroups=ngroups),
                    'data1' : np.random.randn(N),
                    'data2' : np.random.randn(N)})
    def f():
        df.groupby(['key1', 'key2']).agg(lambda x: x.values.sum())
    
    simple_series = Series(np.random.randn(N))
    key1 = df['key1']
    

**Benchmark statement**

.. code-block:: python

    df.groupby(['key1', 'key2'])['data1'].agg(lambda x: x.values.sum())
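
The statement above can be reproduced outside the benchmark harness. The following is a minimal sketch, not the vbench runner itself: it assumes a recent NumPy and pandas, spells out the imports that ``pandas_vb_common`` normally provides, and uses a smaller, hypothetical ``N`` and a fixed seed so that it finishes quickly. The helper name ``statement`` is also an assumption made for illustration.

.. code-block:: python

    import timeit

    import numpy as np
    import pandas as pd

    # Assumption: smaller than the benchmark's N = 100000 to keep the run short.
    N = 10000
    ngroups = 100

    rng = np.random.default_rng(0)  # fixed seed, not part of the original setup


    def make_keys():
        # Tile the group labels to N entries, shuffle, and store as objects,
        # mirroring what get_test_data does in the benchmark setup.
        labels = np.tile(np.arange(ngroups), N // ngroups)
        return rng.permutation(labels).astype(object)


    df = pd.DataFrame({'key1': make_keys(),
                       'key2': make_keys(),
                       'data1': rng.standard_normal(N),
                       'data2': rng.standard_normal(N)})


    def statement():
        # Same expression as the benchmark statement: a Python-level
        # aggregation applied to each (key1, key2) group of 'data1'.
        return df.groupby(['key1', 'key2'])['data1'].agg(
            lambda x: x.values.sum())


    result = statement()
    elapsed = timeit.timeit(statement, number=3)
    print("3 runs took %.3f s" % elapsed)

Because the lambda runs once per group in Python, the measured time depends mostly on the number of distinct ``(key1, key2)`` pairs rather than on ``N`` alone.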

**Performance graph**

.. image:: vbench/figures/groupby_multi_python.png
   :width: 6in