Problem description
When I use a generator as the iterable argument to multiprocessing.Pool.map:
pool.map(func, iterable=(x for x in range(10)))
It seems that the generator is fully exhausted before func is ever called.
I want to yield each item and pass it to each process. Thanks.
Recommended answer
multiprocessing.Pool.map converts iterables without a __len__ method to a list before processing. This is done to aid the calculation of chunksize, which the pool uses to group worker arguments and reduce the round-trip cost of scheduling jobs. This is not optimal, especially when chunksize is 1, but since map must exhaust the iterator one way or the other, it's usually not a significant issue.
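The behaviour described above can be observed with a small trace. The sketch below is illustrative only: it uses multiprocessing.pool.ThreadPool instead of a process pool so that the generator and the worker function can append to one shared list, on the assumption that ThreadPool takes the same map code path as Pool. The names trace_map, numbers, and func are hypothetical helpers, not part of the original question.

```python
from multiprocessing.pool import ThreadPool


def trace_map(n_items=5, workers=2):
    """Record the order in which items are generated and processed."""
    events = []  # shared by all threads; list.append is atomic in CPython

    def numbers():
        # A generator has no __len__, so map() turns it into a list first.
        for i in range(n_items):
            events.append(("yield", i))
            yield i

    def func(x):
        events.append(("call", x))
        return x * x

    with ThreadPool(workers) as pool:
        results = pool.map(func, numbers())
    return events, results


if __name__ == "__main__":
    events, results = trace_map()
    print(events)   # all "yield" entries appear before the first "call" entry
    print(results)
```

Every "yield" event is recorded before the first "call" event, which matches the observation in the question that the generator is drained before func runs.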
The relevant code is in pool.py. Notice its use of len:
    def _map_async(self, func, iterable, mapper, chunksize=None, callback=None,
            error_callback=None):
        '''
        Helper function to implement map, starmap and their async counterparts.
        '''
        if self._state != RUN:
            raise ValueError("Pool not running")
        if not hasattr(iterable, '__len__'):
            iterable = list(iterable)

        if chunksize is None:
            chunksize, extra = divmod(len(iterable), len(self._pool) * 4)
            if extra:
                chunksize += 1
        if len(iterable) == 0:
            chunksize = 0
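To make the arithmetic in that excerpt concrete, here is a minimal standalone sketch of the same default chunksize calculation. The function name default_chunksize and its parameters are hypothetical; in pool.py the values come from len(iterable) and len(self._pool).

```python
def default_chunksize(n_items, n_workers):
    """Mirror the default chunksize arithmetic from the excerpt above.

    n_items stands in for len(iterable) and n_workers for len(self._pool);
    both names are illustrative.
    """
    # Aim for roughly four chunks per worker, rounding the chunk size up.
    chunksize, extra = divmod(n_items, n_workers * 4)
    if extra:
        chunksize += 1
    if n_items == 0:
        chunksize = 0
    return chunksize


if __name__ == "__main__":
    for n in (0, 10, 32, 100):
        print(n, default_chunksize(n, 4))
```

Because the calculation needs the total number of items, an iterable without __len__ has to be materialised as a list before its length can be taken.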