python - How to parallelize a sum calculation in python numpy?

Problem description

I have a sum that I'm trying to compute, and I'm having difficulty parallelizing the code. The calculation I'm trying to parallelize is kind of complex (it uses both numpy arrays and scipy sparse matrices). It spits out a numpy array, and I want to sum the output arrays from about 1000 calculations. Ideally, I would keep a running sum over all the iterations. However, I haven't been able to figure out how to do this.

So far, I've tried using joblib's Parallel function and the pool.map function with python's multiprocessing package. For both of these, I use an inner function that returns a numpy array. These functions return a list, which I convert to a numpy array and then sum over.

However, after the joblib Parallel function completes all iterations, the main program never continues running (it looks like the original job is in a suspended state, using 0% CPU). When I use pool.map, I get memory errors after all the iterations are complete.

Is there a way to simply parallelize a running sum of arrays?

Edit: The goal is to do something like the following, except in parallel.

def summers(num_iters):

    sumArr = np.zeros((1,512*512)) #initialize sum
    for index in range(num_iters):
        sumArr = sumArr + computation(index) #computation returns a 1 x 512^2 numpy array

    return sumArr

Recommended answer

I figured out how to parallelize a sum of arrays with multiprocessing, apply_async, and callbacks, so I'm posting this here for other people. I used the example page for Parallel Python for the Sum callback class, although I did not actually use that package for the implementation. It gave me the idea of using callbacks, though. Here's the simplified code for what I ended up using, and it does what I wanted it to do.

import multiprocessing
import threading #Python 3 replacement for the Python 2 "thread" module

import numpy as np

class Sum: #again, this class is from ParallelPython's example code (I modified for an array and added comments)
    def __init__(self):
        self.value = np.zeros((1,512*512)) #this is the initialization of the sum
        self.lock = threading.Lock()
        self.count = 0

    def add(self,value):
        with self.lock: #lock so sum and count are correct if two processes return at same time
            self.count += 1
            self.value += value #the actual summation

def computation(index):
    array1 = np.ones((1,512*512))*index #this is where the array-returning computation goes
    return array1

def summers(num_iters):
    pool = multiprocessing.Pool(processes=8)

    sumArr = Sum() #create an instance of callback class and zero the sum
    for index in range(num_iters):
        pool.apply_async(computation,(index,),callback=sumArr.add) #add() runs in the parent as each result arrives

    pool.close()
    pool.join() #waits for all the processes to finish

    return sumArr.value
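
One thing the simplified code above does not show is what happens when a worker raises. With only `callback` set, the exception stays inside the result object that `apply_async` returns, and if that object is never inspected the sum simply ends up short by one array. A minimal sketch of one way to surface this, assuming Python 3, uses the `error_callback` argument of `apply_async`. The names `ArraySum`, `fail`, and `submit_all` are made up for illustration and are not from the original answer; `pool` is assumed to be a `multiprocessing.Pool` or any object with the same `apply_async` interface.

```python
import threading

import numpy as np


class ArraySum:
    """Hypothetical accumulator: running sum plus a record of failures."""

    def __init__(self, shape):
        self.value = np.zeros(shape)
        self.count = 0
        self.errors = []
        self._lock = threading.Lock()

    def add(self, array):
        # Called in the parent process once per successful task.
        with self._lock:
            self.value += array
            self.count += 1

    def fail(self, exc):
        # Called in the parent process with the exception a task raised.
        with self._lock:
            self.errors.append(exc)


def submit_all(pool, func, num_iters, acc):
    """Submit func(0..num_iters-1) and block until every task has reported."""
    handles = [
        pool.apply_async(func, (i,), callback=acc.add, error_callback=acc.fail)
        for i in range(num_iters)
    ]
    for handle in handles:
        handle.wait()  # returns after the matching callback has run
```

With a real pool this might be called as `submit_all(pool, computation, num_iters, acc)` inside an `if __name__ == "__main__":` guard, checking `acc.errors` before trusting `acc.value`.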

I was also able to get this working using a parallelized map, which was suggested in another answer. I had tried this earlier, but I wasn't implementing it correctly. Both ways work, and I think this answer explains the issue of which method to use (map or apply_async) pretty well. For the map version, you don't need to define the class Sum, and the summers function becomes:

def summers(num_iters):
    pool = multiprocessing.Pool(processes=8)

    outputArr = np.zeros((num_iters,1,512*512)) #you wouldn't have to initialize these
    sumArr = np.zeros((1,512*512))              #but I do to make sure I have the memory

    outputArr = np.array(pool.map(computation, range(num_iters)))
    sumArr = outputArr.sum(0)

    pool.close() #not sure if this is still needed since map waits for all iterations

    return sumArr
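
Note that `pool.map` holds every result in a list before the sum starts. Assuming the default float64 dtype, 1000 arrays of shape (1, 512*512) come to 1000 × 262144 × 8 bytes, roughly 2 GB, and `np.array(...)` then copies them into a second block of the same size. That may be related to the memory errors described in the question. A sketch of an alternative, not from the original answer, is to consume `pool.imap_unordered` and add each array to the total as it arrives, so that the parent holds little more than the running sum and the results currently in flight, as long as it keeps up with the workers. Addition order does not matter here beyond floating-point rounding. The function name `summers_streaming` and the `chunksize` default are assumptions chosen for illustration.

```python
import numpy as np

SHAPE = (1, 512 * 512)


def computation(index):
    # Stand-in for the real array-returning computation, as in the answer.
    return np.ones(SHAPE) * index


def summers_streaming(num_iters, pool, chunksize=1):
    """Sum computation(0..num_iters-1) without storing every result.

    `pool` is assumed to be a multiprocessing.Pool (or an object with the
    same imap_unordered method); the caller creates and closes it.
    """
    total = np.zeros(SHAPE)
    # imap_unordered yields results lazily, in completion order.
    for partial in pool.imap_unordered(computation, range(num_iters), chunksize):
        total += partial  # in-place add, so no new array per iteration
    return total


# Hypothetical usage with worker processes:
#
#     import multiprocessing
#     if __name__ == "__main__":
#         with multiprocessing.Pool(processes=8) as pool:
#             sumArr = summers_streaming(1000, pool)
```

Passing the pool in keeps the function easy to try with `multiprocessing.pool.ThreadPool`, which exposes the same interface, before switching to worker processes.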
