Problem Description
I am reading various tutorials on the multiprocessing module in Python, and am having trouble understanding why/when to call process.join(). For example, I stumbled across this example:
import math
import multiprocessing
from multiprocessing import Queue

# Note: factorize_naive(n) is assumed to be defined elsewhere; the answer
# below shows one possible implementation.

nums = range(100000)
nprocs = 4

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outdict = {}
    for n in nums:
        outdict[n] = factorize_naive(n)
    out_q.put(outdict)

# Each process will get 'chunksize' nums and a queue to put its out
# dict into
out_q = Queue()
chunksize = int(math.ceil(len(nums) / float(nprocs)))
procs = []

for i in range(nprocs):
    p = multiprocessing.Process(
            target=worker,
            args=(nums[chunksize * i:chunksize * (i + 1)],
                  out_q))
    procs.append(p)
    p.start()

# Collect all results into a single result dict. We know how many dicts
# with results to expect.
resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

# Wait for all worker processes to finish
for p in procs:
    p.join()

print(resultdict)
From what I understand, process.join() will block the calling process until the process whose join method was called has completed execution. I also believe that the child processes started in the above code example complete execution upon finishing the target function, that is, after they have pushed their results to the out_q. Lastly, I believe that out_q.get() blocks the calling process until there are results to be pulled. Thus, if you consider the code:
resultdict = {}
for i in range(nprocs):
    resultdict.update(out_q.get())

# Wait for all worker processes to finish
for p in procs:
    p.join()
the main process is blocked by the out_q.get() calls until every single worker process has finished pushing its results to the queue. Thus, by the time the main process exits the for loop, each child process should have completed execution, correct?
If that is the case, is there any reason for calling the p.join() methods at this point? Haven't all worker processes already finished, so how does that cause the main process to "wait for all worker processes to finish"? I ask mainly because I have seen this in multiple different examples, and I am curious if I have failed to understand something.
Recommended Answer
Try running this:
import math
import time
from multiprocessing import Queue
import multiprocessing

def factorize_naive(n):
    # Naive trial-division factorization.
    factors = []
    for div in range(2, int(n**.5) + 1):
        while not n % div:
            factors.append(div)
            n //= div
    if n != 1:
        factors.append(n)
    return factors

def worker(nums, out_q):
    """ The worker function, invoked in a process. 'nums' is a
        list of numbers to factor. The results are placed in
        a dictionary that's pushed to a queue.
    """
    outdict = {}
    for n in nums:
        outdict[n] = factorize_naive(n)
    out_q.put(outdict)

# The __main__ guard is needed on platforms that spawn rather than fork new
# processes (e.g. Windows).
if __name__ == '__main__':
    nums = range(100000)
    nprocs = 4

    # Each process will get 'chunksize' nums and a queue to put its out
    # dict into
    out_q = Queue()
    chunksize = int(math.ceil(len(nums) / float(nprocs)))
    procs = []

    for i in range(nprocs):
        p = multiprocessing.Process(
                target=worker,
                args=(nums[chunksize * i:chunksize * (i + 1)],
                      out_q))
        procs.append(p)
        p.start()

    # Collect all results into a single result dict. We know how many dicts
    # with results to expect.
    resultdict = {}
    for i in range(nprocs):
        resultdict.update(out_q.get())

    time.sleep(5)    # the finished workers sit unjoined during this pause

    # Wait for all worker processes to finish
    for p in procs:
        p.join()

    print(resultdict)

    time.sleep(15)   # keep the parent alive so you can still inspect the process list
Then open the task manager. You should be able to see the 4 subprocesses sit in a zombie state for a few seconds before they disappear from the process list (once the join calls reap them).
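On a Unix-like system you can observe the same thing without a task manager. The following is a minimal sketch (it assumes the procs list from the code above and a POSIX ps): drop it in just before the join loop, i.e. during the time.sleep(5) window, and look for a Z in the state column.

import subprocess

# Place this right before the "for p in procs: p.join()" loop.
for p in procs:
    status = subprocess.run(
        ["ps", "-o", "pid=,stat=,command=", "-p", str(p.pid)],
        capture_output=True, text=True,
    )
    # A finished-but-unjoined child shows up with state 'Z' and '<defunct>'.
    print(status.stdout.strip())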
In more complex situations the child processes could stay in a zombie state forever (like the situation you were asking about in another question), and if you create enough child processes you can fill up the process table, causing trouble for the OS (which may kill your main process to avoid failures).
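A common way to make sure every child gets reaped, even when collecting the results fails, is to pair every start() with a join() in a finally block. The helper below is a sketch of that pattern (the run_workers helper is illustrative, not something from the answer or the multiprocessing API):

import multiprocessing

def run_workers(worker, chunks, out_q):
    """Start one process per chunk and always join them, even on errors."""
    procs = [multiprocessing.Process(target=worker, args=(chunk, out_q))
             for chunk in chunks]
    for p in procs:
        p.start()
    try:
        results = {}
        for _ in procs:
            results.update(out_q.get())
        return results
    finally:
        for p in procs:
            p.join()   # reaps every child, so no zombies are left behind

Higher-level APIs such as multiprocessing.Pool handle this bookkeeping for you.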