Problem Description
Multiprocessing is a wonderful tool, but it is not so straightforward to use large memory chunks with it. You can load chunks in each process and dump the results to disk, but sometimes you need to store the results in memory, and, on top of that, use the fancy numpy functionality.
I have read/googled a lot and came up with some answers:
Use numpy array in shared memory for multiprocessing
Share Large, Read-Only Numpy Array Between Multiprocessing Processes
Python multiprocessing global numpy array
How do I pass large numpy arrays between python subprocesses without saving to disk?
Etc. etc. etc.
They all have drawbacks: not-so-mainstream libraries (sharedmem); storing variables globally; code that's not so easy to read; pipes; etc.
My goal was to seamlessly use numpy in my workers without worrying about conversions and such.
After many trials I came up with this. It works on my Ubuntu 16, Python 3.6, 16 GB, 8-core machine. I took a lot of "shortcuts" compared to previous approaches: no global shared state, no pure memory pointers that need to be converted to numpy inside the workers, large numpy arrays passed directly as process arguments, etc.
Pastebin link above, but I will put a few snippets here.
Some imports:
import numpy as np
import multiprocessing as mp
import multiprocessing.sharedctypes
import ctypes
Allocate some shared memory and wrap it into a numpy array:
def create_np_shared_array(shape, dtype, ctype):
    size = int(np.prod(shape))  # total number of elements to allocate
    shared_mem_chunck = mp.sharedctypes.RawArray(ctype, size)
    numpy_array_view = np.frombuffer(shared_mem_chunck, dtype).reshape(shape)
    return numpy_array_view
Create the shared arrays and put something in them:
src = np.random.rand(*SHAPE).astype(np.float32)
src_shared = create_np_shared_array(SHAPE, np.float32, ctypes.c_float)
dst_shared = create_np_shared_array(SHAPE, np.float32, ctypes.c_float)
src_shared[:] = src[:]  # some numpy ops accept an 'out' array in which to store the results
Spawn the process:
p = mp.Process(target=lengthly_operation,args=(src_shared, dst_shared, k, k + STEP))
p.start()
p.join()
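The body of lengthly_operation lives in the pastebin; a minimal sketch of what such a worker could look like (the np.sqrt op here is an assumption for illustration, not the pastebin's actual computation):

def lengthly_operation(src, dst, start, end):
    # Process rows [start, end) of src and write straight into the shared dst,
    # so nothing has to travel back to the parent through a pipe.
    for i in range(start, end):
        np.sqrt(src[i], out=dst[i])  # 'out=' stores the result in shared memory in place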
Here are some results (see the pastebin code for the full reference):
Serial version: allocate mem 2.3741257190704346 exec: 17.092209577560425 total: 19.46633529663086 Succes: True
Parallel with trivial np: allocate mem 2.4535582065582275 spawn process: 0.00015354156494140625 exec: 3.4581971168518066 total: 5.911908864974976 Succes: False
Parallel with shared mem np: allocate mem 4.535916328430176 (pure alloc:4.014216661453247 copy: 0.5216996669769287) spawn process: 0.00015664100646972656 exec: 3.6783478260040283 total: 8.214420795440674 Succes: True
I also did a cProfile run (why two extra seconds when allocating the shared memory?) and realized that there are some calls into tempfile.py, {method 'write' of '_io.BufferedWriter' objects}.
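For reference, the profiling can be reproduced with something like this (a sketch; the exact invocation in the pastebin may differ):

import cProfile

# run() executes the statement in __main__'s namespace, so SHAPE and
# create_np_shared_array must be defined there.
cProfile.run('create_np_shared_array(SHAPE, np.float32, ctypes.c_float)', sort='cumtime')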
Questions
- Am I doing something wrong?
- Are the (large) arrays pickled back and forth, so that I'm not gaining any speedup? Note that the 2nd run (using regular np arrays) fails the correctness test.
- Is there a way to further improve the timings, code clarity, etc.? (with respect to the multiprocessing paradigm)
Remarks
- I can't use a process Pool, because the memory has to be inherited at fork time and not sent as a parameter.
Recommended Answer
Allocation of the shared array is slow, because apparently it is written to disk first so that it can be shared through an mmap. For reference, see heap.py and sharedctypes.py. This is why tempfile.py shows up in the profiler. I think the advantage of this approach is that the shared memory is cleaned up in case of a crash, which cannot be guaranteed with POSIX shared memory.
There is no pickling happening in your code, thanks to fork; as you said, the memory is inherited. The reason the 2nd run doesn't work is that the child processes are not allowed to write into the parent's memory. Instead, private pages are allocated on the fly, only to be discarded when the child process ends.
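A minimal sketch of that difference, assuming the fork start method (the Linux default) and the create_np_shared_array helper from the question:

import ctypes
import multiprocessing as mp
import numpy as np

def child(regular, shared):
    regular[0] = 1.0  # lands in a private copy-on-write page, discarded at exit
    shared[0] = 1.0   # lands in the mmap-backed buffer, visible to the parent

if __name__ == '__main__':
    regular = np.zeros(4, dtype=np.float32)
    shared = create_np_shared_array((4,), np.float32, ctypes.c_float)
    p = mp.Process(target=child, args=(regular, shared))
    p.start()
    p.join()
    print(regular[0], shared[0])  # prints: 0.0 1.0 under fork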
I have only one suggestion: you don't have to specify a ctype yourself; the right type can be figured out from the numpy dtype through np.ctypeslib._typecodes. Or just use c_byte for everything and use the dtype's itemsize to figure out the size of the buffer; it will be cast by numpy anyway.
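For example, the allocator from the question could shrink to a c_byte version like this (a sketch, reusing the imports from the question):

def create_np_shared_array(shape, dtype):
    dtype = np.dtype(dtype)
    nbytes = int(np.prod(shape)) * dtype.itemsize  # buffer size from the dtype itemsize
    raw = mp.sharedctypes.RawArray(ctypes.c_byte, nbytes)  # one ctype for everything
    return np.frombuffer(raw, dtype=dtype).reshape(shape)  # numpy reinterprets the raw bytes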