Fast Circular Buffer In Python Than The One Using Deque?
I am implementing circular buffer in python using collections.deque to use it for some calculations. This is my original code: clip=moviepy.editor.VideoFileClip('file.mp4') clip_si
Solution 1:
I noticed you changed the code above, but your original code was:
def one():
TempKern=np.array([1,2,3,4,5])
depth=len(TempKern)
buf=deque(np.zeros((2,3)),maxlen=5)
for i in range(10):
buf.append([[i,i+1,i+2],[i+3,i+4,i+5]])
total= + np.sum([np.asarray(buf)[j]*TempKern[j] for j in range(depth)],axis=0)
print('total')
print(total)
return total
You can simplify things greatly and make it run quite a bit faster if you first flatten the arrays for the computation.
def two():
buf = np.zeros((5,6), dtype=np.int32)
for idx, i in enumerate(range(5, 10)):
buf[idx] = np.array([[i,i+1,i+2,i+3,i+4,i+5]], dtype=np.int32)
return (buf.T * np.array([1, 2, 3, 4, 5])).sum(axis=1).reshape((2,3))
The second implementation returns the same values and runs about 4x faster on my machine
one()
>> [[115 130 145]
[160 175 190]] ~ 100µs / loop
two()
>> array([[115, 130, 145],
[160, 175, 190]]) ~~ 26µs / loop
You can further simplify and parameterize this as such:
def three(n, array_shape):
buf = np.zeros((n,array_shape[0]*array_shape[1]), dtype=np.int32)
addit = np.arange(1, n+1, dtype=np.int32)
for idx, i in enumerate(range(n, 2*n)):
buf[idx] = np.arange(i, i+n+1)
return (buf.T * addit).sum(axis=1).reshape(array_shape)
three(5, (2,3))
>> array([[115, 130, 145],
[160, 175, 190]]) ~ 17µs / loop
Note that the second and third version returns a numpy array. You can cast it to a list by using .tolist()
if need be.
Based on your feedback - edit below:
Baca Juga
def four(array_shape):
n = array_shape[0] * array_shape[1] - 1
buf = []
addit = np.arange(1, n+1, dtype=np.int32)
for idx, i in enumerate(range(n, 2*n)):
buf.append(np.arange(i, i+n+1))
buf = np.asarray(buf)
summed = (buf.T * addit).sum(axis=1)
return summed.reshape(array_shape)
Solution 2:
You can have the ring buffer as a numpy array, by doubling the size and slicing:
clipsize = clip.size[::-1]
depth = 30
ringbuffer = np.zeros((2*depth,) + clipsize)
framecounter = 0defnew_filtered_output(image):
global ringbuffer, framecounter
inter_frame = somefunction(image)
idx = framecounter % depth
ringbuffer[idx] = ringbuffer[idx + depth] = inter_frame
buffer = ringbuffer[idx + 1 : idx + 1 + depth]
framecounter += 1# Apply kernel
output = dc + np.sum([buffer[j]*kernel[j] for j inrange(depth)], axis=0)
return output
Now you don't have convert the deque into a numpy array every frame (and every loop iteration..).
As mentioned in the comments, you can apply the kernel more effeciently:
output = dc + np.einsum('ijk,i->jk', buffer, kernel)
Or:
output = dc + np.tensordot(kernel, buffer, axes=1)
Post a Comment for "Fast Circular Buffer In Python Than The One Using Deque?"