How To Make Python Code With Two For Loops Run Faster (Is There A Python Way Of Doing Mathematica's Parallelize)?
I am completely new to Python or any such programming language. I have some experience with Mathematica. I have a mathematical problem which Mathematica solves with its own
Solution 1:
To speed up Python, you have three options:
- deal with specific bottlenecks in the program (as suggested in @LutzL's comment)
- try to speed up the code by compiling it into C using Cython (or by including C code using weave or similar techniques). Since the time-consuming computations in your case happen inside scipy modules rather than in Python code proper (at least I believe they do), this would not help you much here.
- implement multiprocessing, as you suggested in your original question. This will speed up your code to up to slightly less than X times faster if you have X cores. Unfortunately, this is rather complicated in Python.
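Before the full queue-based version below, here is a minimal sketch of the third option using the standard library's multiprocessing.Pool, which often suffices for an independent double loop. The function solve_one and the toy parameter grid are placeholders of my own, not from the original question:

```python
import multiprocessing as mp

def solve_one(params):
    # placeholder for one independent computation (e.g. one ODE solve)
    a, q = params
    return (q, a, a * q)  # stand-in result

if __name__ == "__main__":
    # build the full parameter grid up front
    tasks = [(a, q) for q in range(4) for a in range(3)]
    with mp.Pool(mp.cpu_count()) as pool:
        # map distributes the tasks across the worker processes
        results = pool.map(solve_one, tasks)
    print(len(results))  # prints 12
```

Pool handles the task distribution and result collection internally, so no explicit Queues are needed for this simple case.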
Implementing multiprocessing: an example using the prototype loop from the original question
I assume that the computations you do inside the nested loops of your prototype code are actually independent of one another; since your prototype code is incomplete, I cannot be sure of this. If they are not independent, this approach will of course not work. For the fun function I will use not your differential equation problem but a prototype with the same signature (input and output variables).
import numpy as np
import scipy.integrate
import multiprocessing as mp

def fun(y, t, b, c):
    # replace this function with whatever function you want to work with
    # (this one is the example function from the scipy docs for odeint)
    theta, omega = y
    dydt = [omega, -b*omega - c*np.sin(theta)]
    return dydt

# definitions of worker process and write process functions
def run_thread(input_queue, output_queue):
    # workers will pull tasks from the input_queue, push results into output_queue
    while True:
        try:
            queueitem = input_queue.get(block=False)
            if len(queueitem) == 3:
                a, q, t = queueitem
                sol1 = scipy.integrate.odeint(fun, [1, 0], t, args=(a, q))[..., 0]
                F = 1 + sol1[157]
                output_queue.put((q, a, F))
        except Exception as e:
            print(str(e))
            print("Queue exhausted, terminating")
            break

def write_thread(queue):
    # the writer will pull results from output_queue, write them to outputfile.txt
    f1 = open("outputfile.txt", "w")
    while True:
        try:
            queueitem = queue.get(block=False)
            if queueitem[0] == "TERMINATE":
                f1.close()
                break
            else:
                q, a, F = queueitem
                print("{} {} {} \n".format(q, a, F))
                f1.write("{} {} {} \n".format(q, a, F))
        except:
            # necessary since get() throws an error whenever output_queue is empty
            pass

# define time point sequence
t = np.linspace(0, 10, 201)

# prepare input and output Queues
mpM = mp.Manager()
input_queue = mpM.Queue()
output_queue = mpM.Queue()

# prepare tasks, collect them in input_queue
for q in np.linspace(0.0, 4.0, 100):
    for a in np.linspace(-2.0, 7.0, 100):
        # Your computations, as commented out here, will now happen
        # in the run_thread workers defined above and created below
        # print('Solving for q = {}, a = {}'.format(q, a))
        # sol1 = scipy.integrate.odeint(fun, [1, 0], t, args=(a, q))[..., 0]
        # print(t[157])
        # F = 1 + sol1[157]
        input_tuple = (a, q, t)
        input_queue.put(input_tuple)

# create worker processes, one per CPU
thread_number = mp.cpu_count()
procs_list = [mp.Process(target=run_thread, args=(input_queue, output_queue)) for i in range(thread_number)]
write_proc = mp.Process(target=write_thread, args=(output_queue,))

# start processes
for proc in procs_list:
    proc.start()
write_proc.start()

# wait for the workers to finish
for proc in procs_list:
    proc.join()

# terminate write_thread
output_queue.put(("TERMINATE",))
write_proc.join()
Explanation
- We define the individual problems (or rather their parameters) before commencing computation; we collect them in an input Queue.
- We define a function (run_thread) that is run in the worker processes. This function computes individual problems until there are none left in the input Queue and pushes the results into an output Queue.
- We start as many such worker processes as we have CPUs.
- We start an additional process (write_thread) that collects the results from the output Queue and writes them to a file.
Caveats
- For smaller problems, you could run multiprocessing without Queues. However, if the number of individual computations is large, you would exceed the maximum number of processes the kernel allows you, after which the kernel kills your program.
- There are differences between operating systems in how multiprocessing works. The example above will work on Linux (perhaps also on other Unix-like systems such as Mac and BSD), but not on Windows, because Windows does not have a fork() system call. (I do not have access to a Windows machine and therefore cannot try to implement it for Windows.)
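For what it is worth, the usual workaround on platforms that use the spawn start method (Windows, and macOS on newer Python versions) is to protect all process-creating code with an import guard, since child processes re-import the main module. A minimal sketch (untested on Windows, per the caveat above):

```python
import multiprocessing as mp

def square(x):
    # trivial stand-in for the per-task computation
    return x * x

if __name__ == "__main__":
    # the guard keeps child processes, which re-import this module
    # under the spawn start method, from re-running the setup code
    mp.set_start_method("spawn", force=True)
    with mp.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # prints [1, 4, 9]
```

Without the guard, each spawned child would try to create its own Pool on import, and the program would fail with a RuntimeError.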