Problem description
I wanted to try different ways of using multiprocessing, starting with this example:
$ cat multi_bad.py
import multiprocessing as mp
from time import sleep
from random import randint

def f(l, t):
    # sleep(30)
    return sum(x < t for x in l)

if __name__ == '__main__':
    l = [randint(1, 1000) for _ in range(25000)]
    t = [randint(1, 1000) for _ in range(4)]
    # sleep(15)
    pool = mp.Pool(processes=4)
    result = pool.starmap_async(f, [(l, x) for x in t])
    print(result.get())
Here, l is a list that gets copied 4 times when 4 processes are spawned. To avoid that, the documentation page suggests using queues, shared arrays, or proxy objects created with multiprocessing.Manager. For the last one, I changed the definition of l:
$ diff multi_bad.py multi_good.py
10c10,11
< l = [randint(1, 1000) for _ in range(25000)]
---
> man = mp.Manager()
> l = man.list([randint(1, 1000) for _ in range(25000)])
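Applied to multi_bad.py, that diff yields a multi_good.py along these lines (reconstructed here for readability; the unused sleep import and the commented-out sleep calls are omitted):

import multiprocessing as mp
from random import randint

def f(l, t):
    return sum(x < t for x in l)

if __name__ == '__main__':
    man = mp.Manager()
    # l is now a proxy object living in the manager process
    l = man.list([randint(1, 1000) for _ in range(25000)])
    t = [randint(1, 1000) for _ in range(4)]
    pool = mp.Pool(processes=4)
    result = pool.starmap_async(f, [(l, x) for x in t])
    print(result.get())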
The results still look correct, but the execution time has increased so dramatically that I think I'm doing something wrong:
$ time python multi_bad.py
[17867, 11103, 2021, 17918]
real 0m0.247s
user 0m0.183s
sys 0m0.010s
$ time python multi_good.py
[3609, 20277, 7799, 24262]
real 0m15.108s
user 0m28.092s
sys 0m6.320s
The docs do say that this way is slower than shared arrays, but this just feels wrong. I'm also not sure how I can profile this to get more information on what's going on. Am I missing something?
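One way to see where the time goes (a sketch using only the standard library's cProfile, not something from the original post; the threshold 500 is arbitrary) is to profile a single call of f against the Manager list in the parent process:

import cProfile
import multiprocessing as mp
from random import randint

def f(l, t):
    return sum(x < t for x in l)

if __name__ == '__main__':
    man = mp.Manager()
    l = man.list([randint(1, 1000) for _ in range(25000)])
    # Expect the cumulative-time column to be dominated by the proxy's
    # per-element __getitem__ calls, i.e. by IPC with the manager process.
    cProfile.run('f(l, 500)', sort='cumulative')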
P.S. With shared arrays I get times below 0.25s.
P.P.S. This is on Linux and Python 3.3.
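For reference, the shared-array variant mentioned in the P.S. could look roughly like this (a sketch that relies on Linux's fork start method; not necessarily the exact code behind the 0.25s timing):

import multiprocessing as mp
from random import randint

def f(t):
    # arr is a global created before the pool forks, so workers inherit it
    return sum(x < t for x in arr)

if __name__ == '__main__':
    # lock=False returns a raw shared array, which is fine for read-only use
    arr = mp.Array('i', [randint(1, 1000) for _ in range(25000)], lock=False)
    t = [randint(1, 1000) for _ in range(4)]
    pool = mp.Pool(processes=4)
    print(pool.map(f, t))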
Recommended answer
Linux uses copy-on-write when subprocesses are os.fork-ed. To demonstrate:
import multiprocessing as mp
import numpy as np
import logging
import os

logger = mp.log_to_stderr(logging.WARNING)

def free_memory():
    total = 0
    with open('/proc/meminfo', 'r') as f:
        for line in f:
            line = line.strip()
            if any(line.startswith(field) for field in ('MemFree', 'Buffers', 'Cached')):
                field, amount, unit = line.split()
                amount = int(amount)
                if unit != 'kB':
                    raise ValueError(
                        'Unknown unit {u!r} in /proc/meminfo'.format(u=unit))
                total += amount
    return total

def worker(i):
    x = data[i, :].sum()    # Exercise access to data
    logger.warn('Free memory: {m}'.format(m=free_memory()))

def main():
    procs = [mp.Process(target=worker, args=(i,)) for i in range(4)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()

# Module level, so the forked workers inherit `data`
logger.warn('Initial free: {m}'.format(m=free_memory()))
N = 15000
data = np.ones((N, N))
logger.warn('After allocating data: {m}'.format(m=free_memory()))

if __name__ == '__main__':
    main()
Running this yielded:
[WARNING/MainProcess] Initial free: 2522340
[WARNING/MainProcess] After allocating data: 763248
[WARNING/Process-1] Free memory: 760852
[WARNING/Process-2] Free memory: 757652
[WARNING/Process-3] Free memory: 757264
[WARNING/Process-4] Free memory: 756760
This shows that initially there was roughly 2.5GB of free memory. After allocating a 15000x15000 array of float64s, there was 763248 kB free. This roughly makes sense, since 15000**2 * 8 bytes = 1.8GB, and the drop in free memory, 2.5GB - 0.763248GB, is also roughly 1.8GB.
Now after each process is spawned, the free memory is again reported to be ~750MB. There is no significant decrease in free memory, so I conclude the system must be using copy-on-write. (By contrast, every element access on a multiprocessing.Manager list proxy is a message round-trip to the separate manager process, which is why the Manager-based version in the question is so much slower.)
Conclusion: If you do not need to modify the data, defining it at the global level of the __main__ module is a convenient and (at least on Linux) memory-friendly way to share it among subprocesses.
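Applied to the question's example, that advice might look like this (a sketch assuming the fork start method on Linux):

import multiprocessing as mp
from random import randint

# Module level: forked pool workers inherit this list via copy-on-write,
# so there is no per-element IPC and no explicit copy into each worker.
l = [randint(1, 1000) for _ in range(25000)]

def f(t):
    return sum(x < t for x in l)

if __name__ == '__main__':
    t = [randint(1, 1000) for _ in range(4)]
    pool = mp.Pool(processes=4)
    print(pool.map(f, t))

One caveat: unlike the NumPy buffer above, iterating a Python list touches each element's reference count, so some pages do get written to and copied after the fork; for a 25000-element list of small ints that overhead is negligible.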