List Comprehension Convolution
Solution 1:
I am not sure whether this improves performance, but here is the list comprehension you asked for:
convo = [-sum(g[j] * w[i - j] * dz for j in range(0, i)) for i in range(0, len(g))]
A faster implementation using NumPy:
import numpy as np

# make the matrices square
g = np.repeat(g, g.shape[0]).reshape(g.shape[0], g.shape[0], order='F')
w = np.repeat(w, w.shape[0]).reshape(w.shape[0], w.shape[0], order='F')
# keep only the strictly lower triangle of g (the j < i terms)
g = np.tril(g, k=-1)
# roll each row of w right by its row index and transpose, so that w[i, j] == w[(i - j) % n]
# see: https://stackoverflow.com/questions/20360675/roll-rows-of-a-matrix-independently
rows_w, column_indices_w = np.ogrid[:w.shape[0], :w.shape[1]]
shift = np.arange(w.shape[0])  # all shifts are non-negative here, so no wrap-around correction is needed
w = w[rows_w, column_indices_w - shift[:, np.newaxis]].T
# the leading minus matches the list comprehension above
convo = -np.sum(g * w, axis=1) * dz
For this to work, w and g need to be the same size, but otherwise I'm sure a workaround can be found.
I hope this gives you a more acceptable speedup. In general, try to rewrite your logic/problem in terms of vector/matrix operations.
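If you prefer to keep the original vectors untouched, the same construction can be wrapped in a function and checked against the one-liner. This is only a sketch of the snippet above; the function name is mine, not from the answer, and it assumes g and w are 1-D NumPy arrays of equal length.
import numpy as np

def halfconvolution_matrix(g, w, dz):
    """Matrix form of the snippet above; g and w must be 1-D arrays of equal length."""
    n = g.shape[0]
    g_sq = np.tril(np.repeat(g, n).reshape(n, n, order='F'), k=-1)  # g_sq[i, j] = g[j] for j < i, else 0
    w_sq = np.repeat(w, n).reshape(n, n, order='F')                 # w_sq[i, j] = w[j]
    rows, cols = np.ogrid[:n, :n]
    shift = np.arange(n)
    w_sq = w_sq[rows, cols - shift[:, np.newaxis]].T                # w_sq[i, j] = w[(i - j) % n]
    return -np.sum(g_sq * w_sq, axis=1) * dz

g = np.random.rand(500)
w = np.random.rand(500)
dz = 0.15
ref = [-sum(g[j] * w[i - j] * dz for j in range(i)) for i in range(len(g))]
assert np.allclose(halfconvolution_matrix(g, w, dz), ref)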
Solution 2:
The inner loop can be replaced by the built-in sum function (don't shadow it with a variable of the same name), and the outer loop then becomes the trailing for-clause of the comprehension:
[-sum(g[j]*w[i-j]*dz for j in range(i)) for i in range(len(g))]
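As a quick illustration of why the shadowing warning matters (a hypothetical snippet, not from the answer), rebinding sum to a number makes the comprehension fail:
# sketch: what happens if the built-in sum has been shadowed
g = [0.1, 0.2, 0.3]
w = [1.0, 2.0, 3.0]
dz = 0.15

sum = 0.0   # shadows the built-in sum
try:
    [-sum(g[j] * w[i - j] * dz for j in range(i)) for i in range(len(g))]
except TypeError as e:
    print(e)   # typically: 'float' object is not callable
del sum        # restore access to the built-in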
Solution 3:
Don't use list comprehensions when performance matters.
Use
- Numba
- Cython
- Vectorized NumPy operations (see the np.convolve sketch right after this list)
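As one possible vectorized-NumPy route (my suggestion, not from the original answers): the inner sum is a truncated linear convolution, so np.convolve can do most of the work. The full convolution includes the j == i term, which is subtracted out below; the function name is illustrative and assumes g and w have the same length.
import numpy as np

def halfconvolution_np(g, w, dz):
    # full convolution: c[i] = sum_{j<=i} g[j]*w[i-j]; subtract the j == i term
    # (g[i]*w[0]) to recover the half-open sum over j < i
    c = np.convolve(g, w)[:len(g)]
    return -dz * (c - g * w[0])

g = np.random.rand(1000)
w = np.random.rand(1000)
dz = 0.15
ref = [-sum(g[j] * w[i - j] * dz for j in range(i)) for i in range(len(g))]
assert np.allclose(halfconvolution_np(g, w, dz), ref)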
Numba
import numba as nb
import numpy as np
import time

@nb.njit(fastmath=True)
def halfconvolution(g, w, dz):
    convo = np.empty(g.shape[0], dtype=g.dtype)
    for i in range(g.shape[0]):
        acc = 0.  # avoid shadowing the built-in sum
        for j in range(i):
            acc += g[j] * w[i - j] * dz
        convo[i] = -acc
    return convo
g = np.random.rand(1000)
w = np.random.rand(1000)
dz = 0.15

t1 = time.time()
for _ in range(1000):
    # switch between the two lines below to time each version
    #res = halfconvolution(g, w, dz)
    res = [-sum(g[j] * w[i - j] * dz for j in range(i)) for i in range(len(g))]
print(time.time() - t1)
print("Done")
Performance
List Comprehension: 0.27s per iteration
Numba Version: 0.6ms per iteration
So there is roughly a factor of 450 between these two versions. If you want to call this function on multiple arrays at once, you can also parallelize the problem easily, and you should get at least another "number of cores" speedup; a sketch of that is shown below.
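A minimal sketch of that parallelization, assuming the inputs arrive stacked as the rows of a 2-D array sharing one w and dz; the batched layout and the function name are assumptions, not part of the original answer.
import numba as nb
import numpy as np

@nb.njit(fastmath=True, parallel=True)
def halfconvolution_batch(gs, w, dz):
    # gs[k] is the k-th g vector; rows are independent, so the outer loop runs in parallel
    out = np.empty_like(gs)
    for k in nb.prange(gs.shape[0]):
        for i in range(gs.shape[1]):
            acc = 0.
            for j in range(i):
                acc += gs[k, j] * w[i - j] * dz
            out[k, i] = -acc
    return out

gs = np.random.rand(8, 1000)   # 8 independent g vectors sharing one w and dz
w = np.random.rand(1000)
res = halfconvolution_batch(gs, w, 0.15)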