Checking In Between Values With Numpy Python
Solution 1:
The answer from a previous question:
In [173]: import numpy as np
     ...: Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
     ...: Formating = np.array([0, 2, 5, 12, 15, 22])
     ...: x = np.sort(Numbers)
     ...: l = np.searchsorted(x, Formating, side='left')
     ...:
In [174]: l
Out[174]: array([0, 0, 2, 6, 6, 7])
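Each entry of l is the position in the sorted array where the matching boundary would be inserted, so consecutive pairs of entries delimit one slice of x. A minimal sketch of that idea follows; slices_between is a hypothetical helper name, not part of NumPy, and the side='right' call is only there to show how the choice of side decides which interval a value equal to a boundary (here 5) lands in.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])
x = np.sort(Numbers)


def slices_between(sorted_values, edges, side):
    """Hypothetical helper: slice sorted_values between consecutive edges."""
    idx = np.searchsorted(sorted_values, edges, side=side)
    # Pair each insertion index with the next one to get slice bounds.
    return [sorted_values[a:b].tolist() for a, b in zip(idx[:-1], idx[1:])]


# side='left' selects lower <= value < upper
left_bins = slices_between(x, Formating, 'left')
# side='right' selects lower < value <= upper
right_bins = slices_between(x, Formating, 'right')

print(left_bins)
print(right_bins)
```

With side='left' the value 5 is grouped with 7, 8 and 10; with side='right' it moves to the interval that ends at 5.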
In [175]: for i in range(len(l)-1):
     ...:     if l[i] >= l[i+1]:
     ...:         print('Numbers between %d,%d = _0_' % (Formating[i], Formating[i+1]))
     ...:     else:
     ...:         print('Numbers between %d,%d = %s' % (Formating[i], Formating[i+1],
     ...:               ','.join(map(str, list(x[l[i]:l[i+1]])))))
     ...:
Numbers between 0,2 = _0_
Numbers between 2,5 = 3,4
Numbers between 5,12 = 5,7,8,10
Numbers between 12,15 = _0_
Numbers between 15,22 = 20
Something that works fine with lists - in fact it is faster with lists than with arrays:
In [182]: for i in range(len(Formating)-1):
     ...:     print([x for x in Numbers if (Formating[i]<=x<Formating[i+1])])
     ...:
[]
[3, 4]
[5, 7, 8, 10]
[]
[20]
A version that iterates over Formating, but not over Numbers. It is rather similar to the version using searchsorted. I'm not sure which will be faster:
In [177]: for i in range(len(Formating)-1):
     ...:     idx = (Formating[i]<=Numbers)&(Numbers<Formating[i+1])
     ...:     print(Numbers[idx])
     ...:
[]
[3 4]
[ 5  7  8 10]
[]
[20]
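As an aside, the same grouping can be had without writing the two comparisons by hand. This is a different technique from the loop above, not something the original answer used: np.digitize labels every value with the index of the interval it falls in, and one boolean comparison per interval then pulls out the members. A sketch, assuming the same half-open intervals (lower <= value < upper); labels and groups are illustrative names.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

# With the default right=False, a label i means
# Formating[i-1] <= value < Formating[i].
labels = np.digitize(Numbers, Formating)

# One boolean selection per interval; still a loop over Formating only.
groups = [Numbers[labels == i].tolist() for i in range(1, len(Formating))]

print(labels)
print(groups)
```

Values below Formating[0] get label 0 and values at or above Formating[-1] get label len(Formating), so they drop out of every group, just as in the loop version.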
We could get the idx mask for all values of Formating at once:
In [183]: mask=(Formating[:-1,None]<=Numbers)&(Numbers<Formating[1:,None])
In [184]: mask
Out[184]:
array([[False, False, False, False, False, False, False],
       [ True,  True, False, False, False, False, False],
       [False, False,  True,  True,  True,  True, False],
       [False, False, False, False, False, False, False],
       [False, False, False, False, False, False,  True]])
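The mask works because of broadcasting: indexing with None turns the interval edges into column vectors, and comparing a (5, 1) column against the (7,) Numbers array yields a (5, 7) result with one row per interval. A short sketch that spells out the shapes; lower, upper and counts are names introduced here for illustration.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

lower = Formating[:-1, None]  # shape (5, 1): lower edge of each interval
upper = Formating[1:, None]   # shape (5, 1): upper edge of each interval

# (5, 1) compared with (7,) broadcasts to (5, 7).
mask = (lower <= Numbers) & (Numbers < upper)

# Summing the booleans along each row counts the values per interval.
counts = mask.sum(axis=1)

print(lower.shape, Numbers.shape, mask.shape)
print(counts)
```

Note that the mask holds len(Formating)-1 times len(Numbers) booleans, so memory grows with the product of the two lengths.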
In [185]: N=Numbers[:,None].repeat(5,1).T # 5 = len(Formating)-1
In [186]: N
Out[186]:
array([[ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20]])
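The repeat-and-transpose step is one of several ways to stack Numbers into rows. A sketch of two alternatives that avoid hard-coding the 5, assuming the same arrays as above; n_rows, N_tile and N_view are illustrative names. np.tile makes a real copy, while np.broadcast_to returns a read-only view that does not duplicate the data.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

n_rows = len(Formating) - 1  # one row per interval

# The construction used in the session above.
N_repeat = Numbers[:, None].repeat(n_rows, 1).T

# Same values, built by tiling the 1-D array n_rows times.
N_tile = np.tile(Numbers, (n_rows, 1))

# Same values again, as a read-only broadcast view (no copy).
N_view = np.broadcast_to(Numbers, (n_rows, Numbers.size))

print(np.array_equal(N_repeat, N_tile), np.array_equal(N_repeat, N_view))
```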
In [187]: np.ma.masked_array(N,~mask)
Out[187]:
masked_array(
  data=[[--, --, --, --, --, --, --],
        [3, 4, --, --, --, --, --],
        [--, --, 5, 7, 8, 10, --],
        [--, --, --, --, --, --, --],
        [--, --, --, --, --, --, 20]],
  mask=[[ True,  True,  True,  True,  True,  True,  True],
        [False, False,  True,  True,  True,  True,  True],
        [ True,  True, False, False, False, False,  True],
        [ True,  True,  True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True,  True, False]],
  fill_value=999999)
Your lists are apparent there, but displaying them as lists still requires iteration:
In [188]: for row in mask:
     ...:     print(Numbers[row])
[]
[3 4]
[ 5  7  8 10]
[]
[20]
I'll let you time test these alternatives with this or more realistic data. I suspect a pure list version is fastest for small problems, but I'm not sure how the others will scale.
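For the timing itself, something along these lines could serve as a starting point. It is only a sketch: the three function names are made up here, the call count of 2000 is arbitrary, and the numbers will depend on your machine and on how large the real data is, so no results are quoted.

```python
import timeit

import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])


def with_lists(numbers, edges):
    # Pure Python: expects plain lists, not arrays.
    return [[v for v in numbers if lo <= v < hi]
            for lo, hi in zip(edges[:-1], edges[1:])]


def with_searchsorted(numbers, edges):
    x = np.sort(numbers)
    idx = np.searchsorted(x, edges, side='left')
    return [x[a:b] for a, b in zip(idx[:-1], idx[1:])]


def with_mask(numbers, edges):
    mask = (edges[:-1, None] <= numbers) & (numbers < edges[1:, None])
    return [numbers[row] for row in mask]


num_list, edge_list = Numbers.tolist(), Formating.tolist()
candidates = {
    'lists': lambda: with_lists(num_list, edge_list),
    'searchsorted': lambda: with_searchsorted(Numbers, Formating),
    'mask': lambda: with_mask(Numbers, Formating),
}

for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=2000)
    print(f'{name:>12}: {seconds:.4f} s for 2000 calls')
```

To see how the approaches scale, swap in larger arrays, for example ones drawn with np.random.default_rng, and rerun the loop.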
Edit
Follow-up questions ask about sums. np.ma.sum, or the masked array's own sum method, sums only the unmasked values, effectively treating the masked values as 0. A row that is entirely masked stays masked in the result:
In [253]: np.ma.masked_array(N,~mask).sum(axis=1)
Out[253]:
masked_array(data=[--, 7, 30, --, 20],
             mask=[ True, False, False,  True, False],
       fill_value=999999)
In [256]: np.ma.masked_array(N,~mask).filled(0)
Out[256]:
array([[ 0,  0,  0,  0,  0,  0,  0],
       [ 3,  4,  0,  0,  0,  0,  0],
       [ 0,  0,  5,  7,  8, 10,  0],
       [ 0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 20]])
Actually we don't need to use the masked array mechanism to get here (though it can be nice visually):
In [258]: N*mask
Out[258]:
array([[ 0,  0,  0,  0,  0,  0,  0],
       [ 3,  4,  0,  0,  0,  0,  0],
       [ 0,  0,  5,  7,  8, 10,  0],
       [ 0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 20]])
In [259]: (N*mask).sum(axis=1)
Out[259]: array([ 0,  7, 30,  0, 20])
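If only the per-interval sums are wanted, the stacked array N is not strictly needed, since the mask broadcasts against the 1-D Numbers directly. The sketch below shows that, plus np.histogram with weights as a swapped-in alternative that skips the mask altogether. One caveat I'm assuming matters for your data: np.histogram treats the last interval as closed on the right, so a value exactly equal to Formating[-1] would be counted there, unlike in the mask version.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

mask = (Formating[:-1, None] <= Numbers) & (Numbers < Formating[1:, None])

# (5, 7) mask times (7,) values broadcasts row by row; no N required.
sums_mask = (mask * Numbers).sum(axis=1)

# Weighted histogram: each value contributes itself to its interval.
# Note: the final bin is [15, 22], closed on the right.
sums_hist, _ = np.histogram(Numbers, bins=Formating, weights=Numbers)

print(sums_mask)
print(sums_hist)
```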