I'm using multiprocessing, and specifically a Pool, to spin off a couple of 'threads' to do a bunch of slow jobs that I have. However, for some reason, I can't get the main thread to rejoin, even though all of the children appear to have died.
Resolved: It appears the answer to this question is to just launch multiple Process objects, rather than using a Pool. It's not abundantly clear why, but I suspect the remaining process is a manager for the pool and it's not dying when the processes finish. If anyone else has this problem, this is the answer.
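The workaround described above can be sketched roughly as follows. This is a minimal illustration rather than the original code: do_work, run_job, NUM_JOBS, and the Python 3 syntax (range instead of xrange) are assumptions made for the example.

```python
import multiprocessing

NUM_JOBS = 12  # assumed to match the 12 workers used with the Pool


def do_work(job_id):
    # Hypothetical stand-in for one slow job; returns a value so it is easy to check.
    return job_id * job_id


def run_job(job_id):
    # A Process target's return value is discarded, so this wrapper only runs the job.
    do_work(job_id)


if __name__ == "__main__":
    # One Process per job, started explicitly instead of going through a Pool.
    workers = [
        multiprocessing.Process(target=run_job, args=(i,))
        for i in range(NUM_JOBS)
    ]
    for w in workers:
        w.start()
    # join() returns once the corresponding child process has exited.
    for w in workers:
        w.join()
    print("All jobs finished")
```

With plain Process objects the parent waits only for each child to exit, so it does not depend on a result being reported back through a pool.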
Main Thread
import sys
from multiprocessing import Pool

pool = Pool(processes=12,initializer=thread_init)
for x in xrange(0,13):
    pool.apply_async(thread_dowork)
pool.close()
sys.stderr.write("Waiting for jobs to terminate\n")
pool.join()
The xrange(0,13) is one more than the number of processes because I thought I had an off-by-one error: one process wasn't getting a job, so it wasn't dying, and I wanted to force it to take a job. I have tried it with 12 as well.
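One way to check whether every submitted job actually reports back is to keep the AsyncResult objects that apply_async returns and call get with a timeout, so a job that never completes shows up as a timeout instead of an indefinite wait. The sketch below assumes Python 3; do_work, collect, and the 30-second timeout are hypothetical names and values chosen for illustration.

```python
import multiprocessing


def do_work(job_id):
    # Hypothetical job body; returns normally so the pool can record a result.
    return job_id + 1


def collect(results, timeout):
    # Gather each job's outcome; a job that does not report back in time becomes None.
    outcomes = []
    for r in results:
        try:
            outcomes.append(r.get(timeout=timeout))
        except multiprocessing.TimeoutError:
            outcomes.append(None)
    return outcomes


if __name__ == "__main__":
    with multiprocessing.Pool(processes=12) as pool:
        # Keep the AsyncResult handles instead of discarding them.
        results = [pool.apply_async(do_work, (i,)) for i in range(12)]
        print(collect(results, timeout=30))
```

Any None in the printed list marks a job whose result never arrived within the timeout, which narrows down where the wait is coming from.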
Multiprocessing Functions
import os
import sys

def thread_init():
    global log_out
    log_out = open('pool_%s.log'%os.getpid(),'w')
    sys.stderr = log_out
    sys.stdout = log_out
    log_out.write("Spawned")
    log_out.flush()
    log_out.write(" Complete\n")
    log_out.flush()

def thread_dowork():
    log_out.write("Entered function\n")
    log_out.flush()
    #Do Work
    log_out.write("Exiting ")
    log_out.flush()
    log_out.close()
    sys.exit(0)
The output of the logfiles for all 12 children is:
Spawned
Complete
Entered function
Exiting
The main thread prints 'Waiting for jobs to terminate', and then just sits there.
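One possible explanation, which the post itself does not confirm, is the sys.exit(0) at the end of thread_dowork: a pool worker that exits in the middle of a task never sends a result back, and the pool can then keep waiting for it. A minimal sketch of a worker that returns normally instead, leaving shutdown to close() and join(), might look like this. The names init_worker and do_work and the Python 3 syntax are assumptions for illustration.

```python
import multiprocessing
import os


def init_worker():
    # Runs once in each worker process; a real initializer might open a log file here.
    global worker_pid
    worker_pid = os.getpid()


def do_work(job_id):
    # Return normally rather than calling sys.exit(); the pool reuses this process.
    return (job_id, "done")


if __name__ == "__main__":
    pool = multiprocessing.Pool(processes=12, initializer=init_worker)
    results = [pool.apply_async(do_work, (i,)) for i in range(12)]
    pool.close()  # no more jobs will be submitted
    pool.join()   # workers shut down once the submitted jobs are finished
    print([r.get() for r in results])
```

The worker processes are long-lived and shared between jobs, so per-process cleanup such as closing a log file is probably better left out of the job function itself.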
top shows only one copy of the script (the main one, I believe). htop shows two copies, one of which is the one from top, and the other one of which is something else. Based on its PID, it's none of the children either.
Does anyone know something I don't?