Question
I've read through the documentation, but I don't understand what is meant by:
The delayed function is a simple trick to be able to create a tuple (function, args, kwargs) with a function-call syntax.
I'm using it to iterate over the list I want to operate on (allImages) as follows:
from joblib import Parallel, delayed

def joblib_loop():
    Parallel(n_jobs=8)(delayed(getHog)(i) for i in allImages)
This returns my HOG features, like I want (and with the speed gain using all my 8 cores), but I'm just not sure what it is actually doing.
My Python knowledge is alright at best, and it's very possible that I'm missing something basic. Any pointers in the right direction would be most appreciated.
Recommended answer
Perhaps things become clearer if we look at what would happen if instead we simply wrote
Parallel(n_jobs=8)(getHog(i) for i in allImages)
which, in this context, could be expressed more naturally as:
- Create a Parallel instance with n_jobs=8
- Create the list [getHog(i) for i in allImages]
- Pass that list to the Parallel instance
What's the problem? By the time the list gets passed to the Parallel object, all getHog(i) calls have already returned, so there's nothing left to execute in parallel! All the work was already done in the main thread, sequentially.
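To make the eager evaluation concrete, here is a minimal sketch that uses only the standard library. The names get_hog, all_images and calls are hypothetical stand-ins for the question's getHog and allImages, not part of joblib.

```python
calls = []

def get_hog(image):
    # Hypothetical stand-in for getHog: records each call as it happens.
    calls.append(image)
    return image * 2

all_images = [1, 2, 3]

# Building the list runs get_hog right away, one item after another.
results = [get_hog(i) for i in all_images]

# Every call has already finished here; only plain results are left,
# so nothing remains that could be handed to parallel workers.
print(calls)
print(results)
```

Whatever receives results afterwards only sees finished values, not work that still needs doing.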
What we actually want is to tell Python what functions we want to call with what arguments, without actually calling them - in other words, we want to delay the execution.
This is what delayed conveniently allows us to do, with clear syntax. If we want to tell Python that we'd like to call foo(2, g=3) sometime later, we can simply write delayed(foo)(2, g=3). What is returned is the tuple (foo, (2,), {'g': 3}), containing:
- a reference to the function we want to call, e.g. foo
- all arguments (short "args") without a keyword, e.g. 2
- all keyword arguments (short "kwargs"), e.g. g=3
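The idea can be sketched in a few lines. The delayed below is a simplified stand-in written for illustration, under the assumption that it only needs to capture the function and its arguments; it is not joblib's actual source, and foo is a hypothetical example function.

```python
import functools

def delayed(function):
    """Simplified stand-in for joblib's delayed (illustrative assumption)."""
    @functools.wraps(function)
    def delayed_function(*args, **kwargs):
        # Do not call the function; just remember what should be called.
        return function, args, kwargs
    return delayed_function

def foo(x, g=0):
    # Hypothetical example function.
    return x + g

task = delayed(foo)(2, g=3)
func, args, kwargs = task

# The call can be replayed later from the stored pieces.
result = func(*args, **kwargs)
print(task[1], task[2], result)
```

Note that foo itself never runs until func(*args, **kwargs) is evaluated.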
So, by writing Parallel(n_jobs=8)(delayed(getHog)(i) for i in allImages), instead of the above sequence, now the following happens:
- A Parallel instance with n_jobs=8 gets created
- The list [delayed(getHog)(i) for i in allImages] gets created, evaluating to [(getHog, (img1,), {}), (getHog, (img2,), {}), ... ]
- That list is passed to the Parallel instance
- The Parallel instance creates 8 workers (separate processes with joblib's default backend, or threads if a threading backend is selected) and distributes the tuples from the list to them
- Finally, each of those workers starts executing the tuples, i.e., it calls the first element with the second and third elements unpacked as arguments, tup[0](*tup[1], **tup[2]), turning the tuple back into the call we actually intended to make, e.g. getHog(img2).
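The steps above can be sketched end to end with the standard library. This is only an approximation under stated assumptions: ThreadPoolExecutor stands in for the Parallel instance, delayed is a simplified stand-in, and get_hog, run_task and all_images are hypothetical names. joblib's real implementation adds batching, backends and error handling that are not shown here.

```python
from concurrent.futures import ThreadPoolExecutor

def delayed(function):
    # Simplified stand-in for joblib's delayed (illustrative assumption).
    def delayed_function(*args, **kwargs):
        return function, args, kwargs
    return delayed_function

def get_hog(image):
    # Hypothetical stand-in for getHog: here just the sum of the "pixels".
    return sum(image)

def run_task(tup):
    # Turn the (function, args, kwargs) tuple back into the intended call.
    return tup[0](*tup[1], **tup[2])

all_images = [[1, 2], [3, 4], [5, 6]]

# Nothing is computed yet: this is only a list of descriptions of calls.
tasks = [delayed(get_hog)(i) for i in all_images]

# The worker pool (standing in for Parallel) executes the stored calls.
with ThreadPoolExecutor(max_workers=8) as pool:
    features = list(pool.map(run_task, tasks))

print(features)
```

Because pool.map returns results in input order, features lines up with all_images, which matches how the question uses the output of Parallel.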