問題描述
在 windows 上的 python 中使用多處理時,希望保護程序的入口點.文檔說確保新的 Python 解釋器可以安全地導入主模塊,而不會導致意外的副作用(例如啟動新進程)".誰能解釋一下這到底是什么意思?
While using multiprocessing in python on windows, it is expected to protect the entry point of the program. The documentation says "Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such a starting a new process)". Can anyone explain what exactly does this mean ?
推薦答案
擴展一下你已經得到的好答案,如果你了解 Linux 系統做什么,它會有所幫助.它們使用 fork()
生成新進程,這有兩個好 后果:
Expanding a bit on the good answer you already got, it helps if you understand what Linux-y systems do. They spawn new processes using fork()
, which has two good consequences:
- 主程序中存在的所有數據結構對子進程都是可見的.他們實際上是處理數據的副本.
- 子進程在主程序中緊跟
fork()
的指令處開始執行 - 因此任何已在模塊中執行的模塊級代碼都不會再次執行.
- All data structures existing in the main program are visible to the child processes. They actually work on copies of the data.
- The child processes start executing at the instruction immediately following the
fork()
in the main program - so any module-level code already executed in the module will not be executed again.
fork()
在 Windows 中是不可能的,因此在 Windows 上,每個模塊都由每個子進程重新導入.所以:
fork()
isn't possible in Windows, so on Windows each module is imported anew by each child process. So:
- 在 Windows 上,no 存在于主程序中的數據結構對子進程是可見的;并且,
- 所有模塊級代碼在每個子進程中執行.
- On Windows, no data structures existing in the main program are visible to the child processes; and,
- All module-level code is executed in each child process.
所以你需要考慮一下想要只在主程序中執行的代碼.最明顯的例子是,您希望創建子進程的代碼僅在主程序中運行——因此應該受到 __name__ == '__main__'
的保護.舉一個更微妙的例子,考慮構建一個巨大列表的代碼,您打算將其傳遞給工作進程以進行爬網.您可能也想保護它,因為在這種情況下,讓每個工作進程浪費 RAM 和時間來構建自己的巨大列表的無用副本是沒有意義的.
So you need to think a bit about which code you want executed only in the main program. The most obvious example is that you want code that creates child processes to run only in the main program - so that should be protected by __name__ == '__main__'
. For a subtler example, consider code that builds a gigantic list, which you intend to pass out to worker processes to crawl over. You probably want to protect that too, because there's no point in this case to make each worker process waste RAM and time building their own useless copies of the gigantic list.
請注意,即使在 Linux 系統上適當地使用 __name__ == "__main__"
也是一個好主意,因為它使預期的工作分工更清晰.并行程序可能會令人困惑——每一點都有幫助 ;-)
Note that it's a Good Idea to use __name__ == "__main__"
appropriately even on Linux-y systems, because it makes the intended division of work clearer. Parallel programs can be confusing - every little bit helps ;-)
這篇關于強制使用 if __name__=="__main__"在 Windows 中同時使用多處理的文章就介紹到這了,希望我們推薦的答案對大家有所幫助,也希望大家多多支持html5模板網!