Problem description
Is there a way to assign each worker in a Python multiprocessing pool a unique ID, so that a job being run by a particular worker in the pool can know which worker is running it? According to the docs, a Process has a name, but:
The name is a string used for identification purposes only. It has no semantics. Multiple processes may be given the same name.
For my particular use-case, I want to run a bunch of jobs on a group of four GPUs, and need to set the device number for the GPU that the job should run on. Because the jobs are of non-uniform length, I want to be sure that I don't have a collision on a GPU of a job trying to run on it before the previous one completes (so this precludes pre-assigning an ID to the unit of work ahead of time).
Recommended answer
It seems like what you want is simple: multiprocessing.current_process(). For example:
import multiprocessing

def f(x):
    # Print the Process object of the worker that is running this job.
    print(multiprocessing.current_process())
    return x * x

if __name__ == '__main__':
    p = multiprocessing.Pool()
    print(p.map(f, range(6)))
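Output (from Python 2; on Python 3 the repr differs and worker names carry a start-method prefix, e.g. ForkPoolWorker-1):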
$ python foo.py
<Process(PoolWorker-1, started daemon)>
<Process(PoolWorker-2, started daemon)>
<Process(PoolWorker-3, started daemon)>
<Process(PoolWorker-1, started daemon)>
<Process(PoolWorker-2, started daemon)>
<Process(PoolWorker-4, started daemon)>
[0, 1, 4, 9, 16, 25]
This returns the process object itself, so the process can serve as its own identity. You could also call id on it for a numerical id; in CPython this is the memory address of the process object, which is unique only among objects alive in the same interpreter, so values from different worker processes could in principle coincide. Finally, you can use the ident or pid property of the process, which is only set once the process has started.
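As a rough Python 3 illustration of the name and pid attributes mentioned above, the sketch below has each job report which worker ran it. The function name describe_worker and the pool size of 4 are assumptions made for this example, not part of the original answer.

```python
import multiprocessing
import os


def describe_worker(x):
    """Return the job's result together with the identity of the worker that ran it."""
    proc = multiprocessing.current_process()
    return {
        "input": x,
        "result": x * x,
        "name": proc.name,  # e.g. "ForkPoolWorker-1"; the prefix depends on the start method
        "pid": proc.pid,    # equals os.getpid() when read inside the running process
    }


if __name__ == "__main__":
    # Assumption: four workers, chosen only to mirror the four-GPU use case.
    with multiprocessing.Pool(processes=4) as pool:
        for info in pool.map(describe_worker, range(6)):
            print(info)
```

Returning the identity with the result, rather than printing it inside the worker, keeps the output of concurrent workers from interleaving.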
Furthermore, looking over the source, it seems very likely to me that autogenerated names (the first value in the Process repr strings above) are unique. multiprocessing maintains an itertools.count object for every process, which is used to generate an _identity tuple for any child processes it spawns. So the top-level process produces child processes with single-value ids, they spawn processes with two-value ids, and so on. Then, if no name is passed to the Process constructor, it simply autogenerates the name from the _identity, using ':'.join(...). Pool then alters the name of the process using replace, leaving the autogenerated id the same.
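The naming scheme described above can be imitated in a few lines. This is only an illustration of the idea, not the library's actual code; the helper names autogenerated_name and pool_worker_name are made up for this sketch.

```python
def autogenerated_name(class_name, identity):
    """Mimic how a default process name is built from an _identity tuple."""
    return class_name + "-" + ":".join(str(i) for i in identity)


def pool_worker_name(process_name):
    """Mimic the rename that a Pool applies to its worker processes."""
    return process_name.replace("Process", "PoolWorker")


print(autogenerated_name("Process", (1,)))    # Process-1
print(autogenerated_name("Process", (2, 1)))  # Process-2:1
print(pool_worker_name(autogenerated_name("Process", (4,))))  # PoolWorker-4
```

Because the rename only swaps the class-name prefix, the numeric part derived from _identity is left untouched.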
The upshot of all this is that although two Processes may have the same name, because you may assign the same name to them when you create them, the names are unique if you don't touch the name parameter. Also, you could theoretically use _identity as a unique identifier, but I gather they made that variable private for a reason!
An example of the above:
import multiprocessing

def f(x):
    # A Process created (but never started) inside a worker gets a two-value _identity.
    created = multiprocessing.Process()
    current = multiprocessing.current_process()
    print('running: %s %s' % (current.name, current._identity))
    print('created: %s %s' % (created.name, created._identity))
    return x * x

if __name__ == '__main__':
    p = multiprocessing.Pool()
    print(p.map(f, range(6)))
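Output (from Python 2; on Python 3 the names carry a start-method prefix, e.g. ForkPoolWorker-1 and ForkProcess-1:1):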
$ python foo.py
running: PoolWorker-1 (1,)
created: Process-1:1 (1, 1)
running: PoolWorker-2 (2,)
created: Process-2:1 (2, 1)
running: PoolWorker-3 (3,)
created: Process-3:1 (3, 1)
running: PoolWorker-1 (1,)
created: Process-1:2 (1, 2)
running: PoolWorker-2 (2,)
created: Process-2:2 (2, 2)
running: PoolWorker-4 (4,)
created: Process-4:1 (4, 1)
[0, 1, 4, 9, 16, 25]
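For the four-GPU use case in the question, one possible sketch is to derive a device number from the worker's _identity. It rests on several assumptions: the pool has exactly one worker per GPU, workers are never replaced (for example, no maxtasksperchild), and the private _identity attribute keeps its current meaning. The names NUM_GPUS, gpu_index and job are hypothetical.

```python
import multiprocessing

NUM_GPUS = 4  # assumption: four devices, numbered 0..3


def gpu_index(identity, num_gpus=NUM_GPUS):
    """Map a worker's 1-based index (last element of _identity) to a device number."""
    if not identity:
        raise ValueError("not running inside a child process")
    return (identity[-1] - 1) % num_gpus


def job(x):
    # _identity is private API; its last element is this worker's 1-based index.
    device = gpu_index(multiprocessing.current_process()._identity)
    # A real job would select `device` in its GPU library at this point.
    return x, device


if __name__ == "__main__":
    # One worker per GPU: a worker runs one job at a time, so two jobs
    # never share a device as long as the assumptions above hold.
    with multiprocessing.Pool(processes=NUM_GPUS) as pool:
        for x, device in pool.imap_unordered(job, range(8)):
            print("job", x, "ran on GPU", device)
```

If workers can be restarted, the modulo mapping may send two live workers to the same device. In that case, handing device numbers out through a shared queue would be a safer design than relying on _identity.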