Question
How do I give child processes access to data in shared memory if the data only becomes available after the child processes have been spawned (using multiprocessing.Process)?
I am aware of multiprocessing.sharedctypes.RawArray, but I can't figure out how to give my child processes access to a RawArray that is created after the processes have already started.
The data is generated by the parent process, and the amount of data is not known in advance.
If not for the GIL, I'd be using threading instead, which would make this task a little simpler. Using a non-CPython implementation is not an option.
Looking under the hood of multiprocessing.sharedctypes, it looks like shared ctypes objects are allocated using mmap()ed memory.
So this question really boils down to: can a child process access anonymously mapped memory if mmap() was called by the parent after the child process was spawned?
That's somewhat in the vein of what's being asked in this question, except that in my case the caller of mmap() is the parent process and not the child process.
I created my own version of RawArray that uses shm_open() under the hood. The resulting shared ctypes array can be shared with any process as long as the identifier (tag) matches.
See this answer for details and an example.
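The idea of a ctypes array that any process can attach to by tag can be sketched with the standard library's multiprocessing.shared_memory module (Python 3.8+). This is not the implementation referred to above; the helper names create_shared_array and attach_shared_array are hypothetical, the element type is assumed to be a C double, and the attaching side is run in the same process only to keep the sketch short. In practice the tag and element count would be sent to the already-running child over a Pipe or Queue.

```python
import ctypes
from multiprocessing import shared_memory


def create_shared_array(values, tag=None):
    """Create a named shared segment and expose it as a ctypes double array.

    Hypothetical helper. Passing tag=None lets the library pick a unique name.
    Assumes `values` is a non-empty sequence of floats.
    """
    count = len(values)
    nbytes = count * ctypes.sizeof(ctypes.c_double)
    shm = shared_memory.SharedMemory(name=tag, create=True, size=nbytes)
    arr = (ctypes.c_double * count).from_buffer(shm.buf)
    arr[:] = values
    return shm, arr


def attach_shared_array(tag, count):
    """Attach to an existing segment by tag; any process that knows the tag can do this."""
    shm = shared_memory.SharedMemory(name=tag)
    arr = (ctypes.c_double * count).from_buffer(shm.buf)
    return shm, arr


def demo():
    owner_shm, owner_arr = create_shared_array([1.0, 2.0, 3.0])
    # Stand-in for the consumer side: only the tag and the length are needed.
    peer_shm, peer_arr = attach_shared_array(owner_shm.name, 3)
    peer_arr[1] = 20.0              # a write through one mapping...
    observed = list(owner_arr)      # ...is visible through the other
    # The ctypes arrays hold buffer exports, so drop them before closing.
    del owner_arr, peer_arr
    peer_shm.close()
    owner_shm.close()
    owner_shm.unlink()              # remove the segment once nobody needs it
    return observed


if __name__ == "__main__":
    print(demo())
```

Because the array size is passed alongside the tag, the amount of data does not have to be known before the consumers start.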
Recommended answer
Your problem sounds like a perfect fit for the posix_ipc or sysv_ipc modules, which expose either the POSIX or SysV APIs for shared memory, semaphores, and message queues. The feature matrix there includes excellent advice for picking amongst the modules he provides.
The problem with anonymous mmap(2) areas is that you cannot easily share them with other processes. If they were file-backed, it would be easy, but if you don't actually need the file for anything else, it feels silly. You could use the CLONE_VM flag to the clone(2) system call if this were in C, but I wouldn't want to try using it with a language interpreter that probably makes assumptions about memory safety. (It would be a little dangerous even in C, as maintenance programmers five years from now might also be shocked by the CLONE_VM behavior.)
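For comparison, the file-backed variant mentioned above might look like the following minimal sketch. The child is started first; the parent then creates a temporary file, writes the data through a shared mapping, and sends the path and element count over a pipe. The function names, the use of a Pipe for the hand-off, the temporary file location, and the fork start method are all assumptions made for this illustration, and the input is assumed to be non-empty.

```python
import mmap
import multiprocessing as mp
import os
import struct
import tempfile


def child(conn):
    """Runs in the child: wait until the parent announces the backing file."""
    path, count = conn.recv()               # blocks until the data exists
    nbytes = count * struct.calcsize("d")
    with open(path, "r+b") as f, mmap.mmap(f.fileno(), nbytes) as mm:
        values = struct.unpack(f"{count}d", mm[:nbytes])
    conn.send(sum(values))
    conn.close()


def run_demo(values):
    ctx = mp.get_context("fork")            # assumption: POSIX system with fork
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=child, args=(child_conn,))
    proc.start()                            # child is running before any data exists
    child_conn.close()                      # the parent keeps only its own end

    # The data is produced only now, after the child has started.
    payload = struct.pack(f"{len(values)}d", *values)
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, len(payload))      # size the file before mapping it
        with mmap.mmap(fd, len(payload)) as mm:   # shared mapping by default on Unix
            mm[:] = payload
        parent_conn.send((path, len(values)))
        total = parent_conn.recv()
        proc.join()
    finally:
        os.close(fd)
        os.unlink(path)
    return total


if __name__ == "__main__":
    print(run_demo([1.0, 2.0, 3.5]))
```

The cost of this approach is exactly the one described above: a file has to exist somewhere just to serve as the rendezvous point.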
But the SysV and newer POSIX shared memory mappings allow even unrelated processes to attach to and detach from shared memory by identifier. All you need to do is share the identifier from the process that creates a mapping with the processes that consume it; then, when you manipulate data within the mapping, it is available to all processes simultaneously without any additional parsing overhead. The shm_open(3) function returns an int that is used as a file descriptor in later calls to ftruncate(2) and then mmap(2), so other processes can use the shared memory segment without a file being created in the filesystem, and this memory will persist even if all processes using it have exited. (A little strange for Unix, perhaps, but it is flexible.)
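The persistence described above can be illustrated with a short sketch. It uses the standard library's multiprocessing.shared_memory module (Python 3.8+) rather than posix_ipc or sysv_ipc; on POSIX systems that module is built on the same shm_open, ftruncate, and mmap sequence. The function names are hypothetical, the payload is assumed to be non-empty, and both steps run in one process here only for brevity. The behaviour shown assumes a POSIX system.

```python
from multiprocessing import shared_memory


def write_then_detach(payload: bytes) -> str:
    """Create a named segment, fill it, detach, and return its identifier."""
    shm = shared_memory.SharedMemory(create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    name = shm.name
    shm.close()          # detach only; on POSIX the segment keeps existing
    return name


def reattach_and_remove(name: str, length: int) -> bytes:
    """Attach by identifier, copy the data out, then remove the segment."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        return bytes(shm.buf[:length])
    finally:
        shm.close()
        shm.unlink()     # without this the segment would outlive every process


if __name__ == "__main__":
    tag = write_then_detach(b"late data")
    print(reattach_and_remove(tag, 9))
```

Because the segment outlives its users, some process has to be responsible for calling unlink, otherwise the memory stays allocated until it is removed manually or the machine reboots.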