Checking In Between Values With Numpy Python
Solution 1:
The answer from a previous question:
In [173]: Numbers = np.array([3, 4, 5, 7, 8, 10,20])
...: Formating = np.array([0, 2 , 5, 12, 15, 22])
...: x = np.sort(Numbers);
...: l = np.searchsorted(x, Formating, side='left')
...:
In [174]: l
Out[174]: array([0, 0, 2, 6, 6, 7])
In [175]: for i in range(len(l)-1):
     ...:     if l[i] >= l[i+1]:
     ...:         print('Numbers between %d,%d = _0_' % (Formating[i], Formating[i+1]))
     ...:     else:
     ...:         print('Numbers between %d,%d = %s' % (Formating[i], Formating[i+1], ','.join(map(str, list(x[l[i]:l[i+1]])))))
     ...:
Numbers between 0,2 = _0_
Numbers between 2,5 = 3,4
Numbers between 5,12 = 5,7,8,10
Numbers between 12,15 = _0_
Numbers between 15,22 = 20
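The same searchsorted indices can be wrapped in a small helper that returns the groups instead of printing them. This is only a minimal sketch: the function name values_between is made up for illustration, and it assumes the edges are sorted in ascending order, with each interval treated as half-open [lo, hi).

```python
import numpy as np

def values_between(numbers, edges):
    """Hypothetical helper: one array per interval [edges[i], edges[i+1])."""
    x = np.sort(np.asarray(numbers))
    # side='left' gives, for each edge, the index of the first value >= that edge
    idx = np.searchsorted(x, edges, side='left')
    # slicing between consecutive indices yields the values in each interval;
    # an empty slice means no values fall in that interval
    return [x[lo:hi] for lo, hi in zip(idx[:-1], idx[1:])]

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

groups = values_between(Numbers, Formating)
for lo, hi, g in zip(Formating[:-1], Formating[1:], groups):
    print(lo, hi, g.tolist())
```

Returning arrays rather than printing makes it easier to reuse the groups later, for example to sum or count them.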
A plain list comprehension also works, and it is in fact faster with lists than with arrays:
In [182]: for i in range(len(Formating)-1):
     ...:     print([x for x in Numbers if (Formating[i]<=x<Formating[i+1])])
     ...:
[]
[3, 4]
[5, 7, 8, 10]
[]
[20]
A version that iterates on Formating, but not on Numbers. It is rather similar to the version using searchsorted. I'm not sure which will be faster:
In [177]: for i in range(len(Formating)-1):
     ...:     idx = (Formating[i]<=Numbers)&(Numbers<Formating[i+1])
     ...:     print(Numbers[idx])
     ...:
[]
[3 4]
[ 5  7  8 10]
[]
[20]
We could get the idx mask for all values of Formating at once:
In [183]: mask=(Formating[:-1,None]<=Numbers)&(Numbers<Formating[1:,None])
In [184]: mask
Out[184]:
array([[False, False, False, False, False, False, False],
       [ True,  True, False, False, False, False, False],
       [False, False,  True,  True,  True,  True, False],
       [False, False, False, False, False, False, False],
       [False, False, False, False, False, False,  True]])
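The mask relies on broadcasting: indexing with None turns the length-5 slices of Formating into (5, 1) columns, which combine with the length-7 Numbers array to give a (5, 7) result. A short sketch that only checks those shapes and compares one row against the per-interval idx from the loop version (the names lower and upper are introduced here for readability):

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

lower = Formating[:-1, None]   # lower edges as a column, shape (5, 1)
upper = Formating[1:, None]    # upper edges as a column, shape (5, 1)
mask = (lower <= Numbers) & (Numbers < upper)
print(lower.shape, Numbers.shape, mask.shape)

# row i of the mask is the same boolean idx the loop computes for interval i
i = 2
idx = (Formating[i] <= Numbers) & (Numbers < Formating[i+1])
print(np.array_equal(mask[i], idx))
```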
In [185]: N=Numbers[:,None].repeat(5,1).T    # 5 = len(Formating)-1
In [186]: N
Out[186]:
array([[ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20],
       [ 3,  4,  5,  7,  8, 10, 20]])
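The repeat-and-transpose step is one way to build N. As an alternative sketch (not what the answer itself uses), np.tile gives the same array, and np.broadcast_to gives a read-only view with the same values without copying the data:

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])
rows = len(Formating) - 1

# the construction used in the session above
N = Numbers[:, None].repeat(rows, 1).T
# stack `rows` copies of Numbers as rows
N_tile = np.tile(Numbers, (rows, 1))
# read-only broadcast view; no data is copied
N_view = np.broadcast_to(Numbers, (rows, len(Numbers)))

print(np.array_equal(N, N_tile), np.array_equal(N, N_view))
```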
In [187]: np.ma.masked_array(N,~mask)
Out[187]:
masked_array(
  data=[[--, --, --, --, --, --, --],
        [3, 4, --, --, --, --, --],
        [--, --, 5, 7, 8, 10, --],
        [--, --, --, --, --, --, --],
        [--, --, --, --, --, --, 20]],
  mask=[[ True,  True,  True,  True,  True,  True,  True],
        [False, False,  True,  True,  True,  True,  True],
        [ True,  True, False, False, False, False,  True],
        [ True,  True,  True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True,  True, False]],
  fill_value=999999)
Your lists are apparent there. But displaying them as lists still requires iteration:
In [188]: for row in mask:
     ...:     print(Numbers[row])
[]
[3 4]
[ 5  7  8 10]
[]
[20]
I'll let you time test these alternatives with this or more realistic data. I suspect a pure list version is fastest for small problems, but I'm not sure how the others will scale.
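One way to run that comparison is with the standard timeit module. This is a rough sketch: the function names are made up for this example, the repeat count is arbitrary, and the timings will depend on the machine and on the data size.

```python
import timeit
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])
num_list = Numbers.tolist()
fmt_list = Formating.tolist()

def with_lists():
    # pure Python lists, one comprehension per interval
    return [[x for x in num_list if fmt_list[i] <= x < fmt_list[i+1]]
            for i in range(len(fmt_list) - 1)]

def with_searchsorted():
    # sort once, then slice between consecutive insertion indices
    x = np.sort(Numbers)
    l = np.searchsorted(x, Formating, side='left')
    return [x[l[i]:l[i+1]] for i in range(len(l) - 1)]

def with_mask():
    # one broadcasted 2D mask, then boolean indexing per row
    mask = (Formating[:-1, None] <= Numbers) & (Numbers < Formating[1:, None])
    return [Numbers[row] for row in mask]

for f in (with_lists, with_searchsorted, with_mask):
    seconds = timeit.timeit(f, number=1000)
    print(f.__name__, seconds)
```

To see how the approaches scale, replace Numbers and Formating with larger arrays before timing.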
edit
Follow-up questions ask about sums. np.ma.sum, or the masked array's own sum method, sums the unmasked values, effectively filling the masked values with 0. A row in which every value is masked comes out masked rather than 0:
In [253]: np.ma.masked_array(N,~mask).sum(axis=1)
Out[253]:
masked_array(data=[--, 7, 30, --, 20],
             mask=[ True, False, False,  True, False],
       fill_value=999999)
In [256]: np.ma.masked_array(N,~mask).filled(0)
Out[256]:
array([[ 0,  0,  0,  0,  0,  0,  0],
       [ 3,  4,  0,  0,  0,  0,  0],
       [ 0,  0,  5,  7,  8, 10,  0],
       [ 0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 20]])
Actually we don't need to use the masked array mechanism to get here (though it can be nice visually):
In [258]: N*mask
Out[258]:
array([[ 0,  0,  0,  0,  0,  0,  0],
       [ 3,  4,  0,  0,  0,  0,  0],
       [ 0,  0,  5,  7,  8, 10,  0],
       [ 0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0, 20]])
In [259]: (N*mask).sum(axis=1)
Out[259]: array([ 0, 7, 30, 0, 20])
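For sums alone, the 2D array is not strictly needed. As a sketch of an alternative (not part of the answer above), the searchsorted indices from the first example can be combined with a cumulative sum, so each interval's total is a difference of two prefix sums. It assumes Formating is sorted in ascending order; the name csum is introduced here.

```python
import numpy as np

Numbers = np.array([3, 4, 5, 7, 8, 10, 20])
Formating = np.array([0, 2, 5, 12, 15, 22])

x = np.sort(Numbers)
l = np.searchsorted(x, Formating, side='left')

# csum[k] is the sum of the first k sorted values, with csum[0] == 0
csum = np.concatenate(([0], np.cumsum(x)))
# sum of x[l[i]:l[i+1]] is csum[l[i+1]] - csum[l[i]]
sums = csum[l[1:]] - csum[l[:-1]]
print(sums)   # [ 0  7 30  0 20]
```

This uses memory proportional to len(Numbers) instead of building the full len(Formating)-1 by len(Numbers) mask.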