Problem description
I'm just experimenting and learning. I know how to create a shared dictionary that multiple processes can access, but I'm not sure how to keep the dict in sync. I believe defaultdict illustrates the problem I'm having.
from collections import defaultdict
from multiprocessing import Pool, Manager, Process

# test without multiprocessing
s = 'mississippi'
d = defaultdict(int)
for k in s:
    d[k] += 1

print d.items()  # Success! result: [('i', 4), ('p', 2), ('s', 4), ('m', 1)]

print '*'*10, ' with multiprocessing ', '*'*10

def test(k, multi_dict):
    multi_dict[k] += 1

if __name__ == '__main__':
    pool = Pool(processes=4)
    mgr = Manager()
    multi_d = mgr.dict()
    for k in s:
        pool.apply_async(test, (k, multi_d))

    # Mark pool as closed -- no more tasks can be added.
    pool.close()

    # Wait for tasks to exit
    pool.join()

    # Output results
    print multi_d.items()  # FAIL

print '*'*10, ' with multiprocessing and process module like on python site example', '*'*10

def test2(k, multi_dict2):
    multi_dict2[k] += 1

if __name__ == '__main__':
    manager = Manager()
    multi_d2 = manager.dict()
    for k in s:
        p = Process(target=test2, args=(k, multi_d2))
        p.start()
        p.join()

    print multi_d2  # FAIL
The first result works (because it's not using multiprocessing), but I can't get it to work with multiprocessing. I'm not sure how to solve it, but I think it might be because the dict isn't being synced (with the results joined later), or maybe because I can't figure out how to apply defaultdict(int) to the dictionary within multiprocessing.
Any help or suggestions on how to get this to work would be great!
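A likely reason both multiprocessing attempts end with an empty dictionary: the proxy returned by Manager().dict() wraps a plain dict, which has no default value, so the first `+= 1` on a new key raises KeyError inside the worker. With apply_async the exception is stored on the result object and only surfaces when .get() is called, which the code above never does. The sketch below is written for Python 3 and uses a hypothetical helper name, `bump`, to make the hidden error visible.

```python
from multiprocessing import Manager, Pool


def bump(key, shared):
    # A plain dict (and the proxy from Manager().dict()) has no default
    # value, so `+= 1` on a key that is not present raises KeyError.
    shared[key] += 1


if __name__ == '__main__':
    with Manager() as mgr, Pool(processes=4) as pool:
        shared = mgr.dict()
        result = pool.apply_async(bump, ('m', shared))
        try:
            # .get() re-raises the exception that occurred in the worker.
            result.get(timeout=10)
        except KeyError as exc:
            print('worker raised KeyError:', exc)
```

Keeping the AsyncResult objects and calling .get() on each is a cheap way to avoid silent failures of this kind.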
Recommended answer
You can subclass BaseManager and register additional types for sharing. You need to provide a suitable proxy type in cases where the default AutoProxy-generated type does not work. For defaultdict, if you only need to access the attributes that are already present in dict, you can use DictProxy.
from multiprocessing import Pool
from multiprocessing.managers import BaseManager, DictProxy
from collections import defaultdict

class MyManager(BaseManager):
    pass

# Share defaultdict through the manager, reusing the stock dict proxy.
MyManager.register('defaultdict', defaultdict, DictProxy)

def test(k, multi_dict):
    multi_dict[k] += 1

if __name__ == '__main__':
    pool = Pool(processes=4)
    mgr = MyManager()
    mgr.start()
    multi_d = mgr.defaultdict(int)
    for k in 'mississippi':
        pool.apply_async(test, (k, multi_d))
    pool.close()
    pool.join()

    print multi_d.items()