Skip to content Skip to sidebar Skip to footer

Fast Combinations Without Replacement For Arrays - Numpy / Python

I'm after generating efficiently pairwise combinations from 1D array(s). Itertools is just too inefficient with if n > 1000 E.g. [1, 2, 3, 4] magic code... Out[2]: array([[1,

Solution 1:

I. Pairwise-combinations

One way would be with numba to gain memory and hence performance efficiency -

from numba import njit

@njit
def pairwise_combs_numba(a):
    n = len(a)
    L = n*(n-1)//2out = np.empty((L,2),dtype=a.dtype)
    iterID = 0for i inrange(n):
        for j inrange(i+1,n):
            out[iterID,0] = a[i]
            out[iterID,1] = a[j]
            iterID += 1returnout

Another NumPy based one would be using np.broadcast_to to get meshgrid views and then masking -

def pairwise_combs_mask(a):
    n = len(a)
    L = n*(n-1)//2out = np.empty((L,2),dtype=a.dtype)
    m = ~np.tri(len(a),dtype=bool)
    out[:,0] = np.broadcast_to(a[:,None],(n,n))[m]
    out[:,1] = np.broadcast_to(a,(n,n))[m]
    returnout

II. Triplet-combinations

We will extend the same methodology to get ourselves triplet-combinations -

@njit
def triplet_combs_numba(a):
    n = len(a)
    L = n*(n-1)*(n-2)//6out= np.empty((L,3),dtype=a.dtype)
    iterID =0for i inrange(n):
        for j inrange(i+1,n):
            for k inrange(j+1,n):
                out[iterID,0] = a[i]
                out[iterID,1] = a[j]
                out[iterID,2] = a[k]
                iterID +=1returnout

def triplet_combs_mask(a):
    n = len(a)
    L = n*(n-1)*(n-2)//6out= np.empty((L,3),dtype=a.dtype)
    r = np.arange(n)
    m = (r[:,None,None]<r[:,None]) & (r[:,None]<r)
    out[:,0] = np.broadcast_to(a[:,None,None],(n,n,n))[m]
    out[:,1] = np.broadcast_to(a[None,:,None],(n,n,n))[m]
    out[:,2] = np.broadcast_to(a[None,None,:],(n,n,n))[m]
    returnout

Combinations for higher orders would extend like-wise.

Sample runs -

In [54]: a = np.array([3,9,4,1,7])

In [55]: pairwise_combs_numba(a)
Out[55]: 
array([[3, 9],
       [3, 4],
       [3, 1],
       [3, 7],
       [9, 4],
       [9, 1],
       [9, 7],
       [4, 1],
       [4, 7],
       [1, 7]])

In [56]: triplet_combs_numba(a)
Out[56]: 
array([[3, 9, 4],
       [3, 9, 1],
       [3, 9, 7],
       [3, 4, 1],
       [3, 4, 7],
       [3, 1, 7],
       [9, 4, 1],
       [9, 4, 7],
       [9, 1, 7],
       [4, 1, 7]])

Timings (including Python's builtin - itertools.combinations) -

In [68]: a = np.random.rand(4000)

In [69]: %timeit pairwise_combs_numba(a)
    ...: %timeit pairwise_combs_mask(a)
    ...: %timeit list(itertools.combinations(a, 2))
10 loops, best of 3: 52.2 ms per loop
10 loops, best of 3: 146 ms per loop
1 loop, best of 3: 597 ms per loop

In [70]: a = np.random.rand(400)

In [71]: %timeit triplet_combs_numba(a)
    ...: %timeit triplet_combs_mask(a)
    ...: %timeit list(itertools.combinations(a, 3))
10 loops, best of 3: 98.5 ms per loop
1 loop, best of 3: 352 ms per loop
1 loop, best of 3: 795 ms per loop

Post a Comment for "Fast Combinations Without Replacement For Arrays - Numpy / Python"