400 threads in 20 processes outperform 400 threads in 4 processes while performing an I/O-bound task

This article discusses why 400 threads in 20 processes outperform 400 threads in 4 processes while performing an I/O-bound task; it should serve as a useful reference for readers facing a similar problem.

Problem Description


                  Here is the experimental code that can launch a specified number of worker processes and then launch a specified number of worker threads within each process and perform the task of fetching URLs:

                  import multiprocessing
                  import sys
                  import time
                  import threading
                  import urllib.request
                  
                  
                  def main():
                      processes = int(sys.argv[1])
                      threads = int(sys.argv[2])
                      urls = int(sys.argv[3])
                  
                      # Start process workers.
                      in_q = multiprocessing.Queue()
                      process_workers = []
                      for _ in range(processes):
                          w = multiprocessing.Process(target=process_worker, args=(threads, in_q))
                          w.start()
                          process_workers.append(w)
                  
                      start_time = time.time()
                  
                      # Feed work.
                      for n in range(urls):
                          in_q.put('http://www.example.com/?n={}'.format(n))
                  
                      # Send sentinel for each thread worker to quit.
                      for _ in range(processes * threads):
                          in_q.put(None)
                  
                      # Wait for workers to terminate.
                      for w in process_workers:
                          w.join()
                  
                      # Print time consumed and fetch speed.
                      total_time = time.time() - start_time
                      fetch_speed = urls / total_time
                      print('{} x {} workers => {:.3} s, {:.1f} URLs/s'
                            .format(processes, threads, total_time, fetch_speed))
                  
                  
                  
                  def process_worker(threads, in_q):
                      # Start thread workers.
                      thread_workers = []
                      for _ in range(threads):
                          w = threading.Thread(target=thread_worker, args=(in_q,))
                          w.start()
                          thread_workers.append(w)
                  
                      # Wait for thread workers to terminate.
                      for w in thread_workers:
                          w.join()
                  
                  
                  def thread_worker(in_q):
                      # Each thread performs the actual work. In this case, we will assume
                      # that the work is to fetch a given URL.
                      while True:
                          url = in_q.get()
                          if url is None:
                              break
                  
                          with urllib.request.urlopen(url) as u:
                              pass # Do nothing
                              # print('{} - {} {}'.format(url, u.getcode(), u.reason))
                  
                  
                  if __name__ == '__main__':
                      main()
                  


                  Here is how I run this program:

                  python3 foo.py <PROCESSES> <THREADS> <URLS>
                  


For example, python3 foo.py 20 20 10000 creates 20 worker processes with 20 threads in each worker process (thus a total of 400 worker threads) and fetches 10000 URLs. In the end, this program prints how much time it took to fetch the URLs and how many URLs it fetched per second on average.


Note that in all cases I am really hitting a URL of the www.example.com domain, i.e., www.example.com is not merely a placeholder. In other words, I run the above code unmodified.


                  I am testing this code on a Linode virtual private server that has 8 GB RAM and 4 CPUs. It is running Debian 9.

                  $ cat /etc/debian_version 
                  9.9
                  
                  $ python3
                  Python 3.5.3 (default, Sep 27 2018, 17:25:39) 
                  [GCC 6.3.0 20170516] on linux
                  Type "help", "copyright", "credits" or "license" for more information.
                  >>> 
                  
                  $ free -m
                                total        used        free      shared  buff/cache   available
                  Mem:           7987          67        7834          10          85        7734
                  Swap:           511           0         511
                  
                  $ nproc
                  4
                  


                  Case 1: 20 Processes x 20 Threads

                  Here are a few trial runs with 400 worker threads distributed between 20 worker processes (i.e., 20 worker threads in each of the 20 worker processes). In each trial, 10,000 URLs are fetched.

The results are as follows:

                  $ python3 foo.py 20 20 10000
                  20 x 20 workers => 5.12 s, 1954.6 URLs/s
                  
                  $ python3 foo.py 20 20 10000
                  20 x 20 workers => 5.28 s, 1895.5 URLs/s
                  
                  $ python3 foo.py 20 20 10000
                  20 x 20 workers => 5.22 s, 1914.2 URLs/s
                  
                  $ python3 foo.py 20 20 10000
                  20 x 20 workers => 5.38 s, 1859.8 URLs/s
                  
                  $ python3 foo.py 20 20 10000
                  20 x 20 workers => 5.19 s, 1925.2 URLs/s
                  


We can see that about 1900 URLs are fetched per second on average. When I monitor the CPU usage with the top command, I see that each python3 worker process consumes about 10% to 15% CPU.


Now, I thought that I have only 4 CPUs. Even if I launch 20 worker processes, at most 4 processes can run at any point in physical time. Further, due to the global interpreter lock (GIL), only one thread in each process (thus a total of at most 4 threads) can run at any point in physical time.
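That GIL claim is easy to check in isolation. Here is a minimal sketch (my own illustration, not part of the original experiment) that runs the same total amount of pure-Python work once on 1 thread and once split across 4 threads; under CPython's GIL, the two wall-clock times come out roughly equal instead of the threaded run being about 4x faster:

import threading
import time


def burn(n):
    # Pure-Python busy loop; it holds the GIL the whole time it runs.
    while n:
        n -= 1


def timed(num_threads, iters_per_thread):
    workers = [threading.Thread(target=burn, args=(iters_per_thread,))
               for _ in range(num_threads)]
    start = time.time()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.time() - start


if __name__ == '__main__':
    # Same total work (20 million iterations) in both runs. With the
    # GIL, 4 threads take about as long as 1 thread, not a quarter.
    print('1 thread : {:.2f} s'.format(timed(1, 20000000)))
    print('4 threads: {:.2f} s'.format(timed(4, 5000000)))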


Therefore, I thought that if I reduced the number of processes to 4 and increased the number of threads per process to 100, so that the total number of threads still remained 400, the performance should not deteriorate.

Case 2: 4 Processes x 100 Threads

But the test results show that 4 processes containing 100 threads each consistently perform worse than 20 processes containing 20 threads each.

                  $ python3 foo.py 4 100 10000
                  4 x 100 workers => 9.2 s, 1086.4 URLs/s
                  
                  $ python3 foo.py 4 100 10000
                  4 x 100 workers => 10.9 s, 916.5 URLs/s
                  
                  $ python3 foo.py 4 100 10000
                  4 x 100 workers => 7.8 s, 1282.2 URLs/s
                  
                  $ python3 foo.py 4 100 10000
                  4 x 100 workers => 10.3 s, 972.3 URLs/s
                  
                  $ python3 foo.py 4 100 10000
                  4 x 100 workers => 6.37 s, 1570.9 URLs/s
                  


The CPU usage is between 40% and 60% for each python3 worker process.

Case 3: 1 Process x 400 Threads

Just for comparison, I am recording the fact that both Case 1 and Case 2 outperform the case where we have all 400 threads in a single process. This is most certainly due to the global interpreter lock (GIL).

                  $ python3 foo.py 1 400 10000
                  1 x 400 workers => 13.5 s, 742.8 URLs/s
                  
                  $ python3 foo.py 1 400 10000
                  1 x 400 workers => 14.3 s, 697.5 URLs/s
                  
                  $ python3 foo.py 1 400 10000
                  1 x 400 workers => 13.1 s, 761.3 URLs/s
                  
                  $ python3 foo.py 1 400 10000
                  1 x 400 workers => 15.6 s, 640.4 URLs/s
                  
                  $ python3 foo.py 1 400 10000
                  1 x 400 workers => 13.1 s, 764.4 URLs/s
                  


                  The CPU usage is between 120% and 125% for the single python3 worker process.

Case 4: 400 Processes x 1 Thread

Again, just for comparison, here is how the results look when there are 400 processes, each with a single thread.

                  $ python3 foo.py 400 1 10000
                  400 x 1 workers => 14.0 s, 715.0 URLs/s
                  
                  $ python3 foo.py 400 1 10000
                  400 x 1 workers => 6.1 s, 1638.9 URLs/s
                  
                  $ python3 foo.py 400 1 10000
                  400 x 1 workers => 7.08 s, 1413.1 URLs/s
                  
                  $ python3 foo.py 400 1 10000
                  400 x 1 workers => 7.23 s, 1382.9 URLs/s
                  
                  $ python3 foo.py 400 1 10000
                  400 x 1 workers => 11.3 s, 882.9 URLs/s
                  


The CPU usage is between 1% and 3% for each python3 worker process.


                  Picking the median result from each case, we get this summary:

                  Case 1:  20 x  20 workers => 5.22 s, 1914.2 URLs/s ( 10% to  15% CPU/process)
                  Case 2:   4 x 100 workers => 9.20 s, 1086.4 URLs/s ( 40% to  60% CPU/process)
                  Case 3:   1 x 400 workers => 13.5 s,  742.8 URLs/s (120% to 125% CPU/process)
Case 4: 400 x   1 workers => 7.23 s, 1382.9 URLs/s (  1% to   3% CPU/process)
                  


                  Question

Why does 20 processes x 20 threads perform better than 4 processes x 100 threads even though I have only 4 CPUs?

Answer


Your task is I/O-bound rather than CPU-bound: the threads spend most of their time in a sleep state, waiting for network data and such, rather than using the CPU.


                  So adding more threads than CPUs works here as long as I/O is still the bottleneck. The effect will only subside once there are so many threads that enough of them are ready at a time to start actively competing for CPU cycles (or when your network bandwidth is exhausted, whichever comes first).
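As a minimal sketch of that point (my own illustration, with time.sleep standing in for a blocking network read; both release the GIL while waiting), 400 tenth-of-a-second waits spread over 400 threads complete in well under a second of wall time, even on a 4-CPU machine:

import threading
import time


def fake_io():
    # Stands in for waiting on network data; the GIL is released.
    time.sleep(0.1)


start = time.time()
workers = [threading.Thread(target=fake_io) for _ in range(400)]
for w in workers:
    w.start()
for w in workers:
    w.join()
# Prints well under a second, not 400 * 0.1 = 40 s: the waits overlap.
print('400 waits on 400 threads: {:.2f} s'.format(time.time() - start))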


As for why 20 threads per process is faster than 100 threads per process: this is most likely due to CPython's GIL. Python threads in the same process need to wait not only for I/O but for each other, too. When dealing with I/O, the Python machinery:

1. Converts all the Python objects involved into C objects (in many cases this can be done without physically copying the data)
2. Releases the GIL
3. Performs the I/O in C (which involves waiting for it for an arbitrary amount of time)
4. Reacquires the GIL
5. Converts the result into a Python object (where applicable)


If there are enough threads in the same process, it becomes increasingly likely that another one is active when step 4 is reached, causing an additional random delay.
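One way to observe that delay directly is to time how far a short sleep overshoots its deadline as the thread count in a single process grows. The sketch below is my own (the measurement method is an assumption, not something from the original answer); the overshoot includes ordinary OS scheduling noise, but it also grows with the number of threads queuing to reacquire the GIL at step 4:

import threading
import time


def worker(overshoots):
    for _ in range(50):
        start = time.perf_counter()
        time.sleep(0.01)  # C-level wait (steps 2-3); the GIL is released
        # Any time beyond 0.01 s includes waiting to reacquire the GIL.
        overshoots.append(time.perf_counter() - start - 0.01)


def measure(num_threads):
    overshoots = []  # list.append is atomic in CPython, so this is safe
    workers = [threading.Thread(target=worker, args=(overshoots,))
               for _ in range(num_threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return sum(overshoots) / len(overshoots)


if __name__ == '__main__':
    for n in (20, 100, 400):
        print('{:3d} threads: {:.4f} s mean overshoot'.format(n, measure(n)))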


Now, when it comes to lots of processes, other factors come into play, such as memory swapping (since, unlike threads, processes running the same code don't share memory). I'm pretty sure there are other sources of delay from many processes, as opposed to threads, competing for resources, but I can't point to them off the top of my head. That's why the performance becomes unstable.
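The per-process memory cost, at least, is easy to see. Here is a minimal sketch (my own, not from the original answer) that reports the peak resident set size of each of 10 worker processes; every worker carries its own interpreter heap, so resident memory scales with the process count:

import multiprocessing
import resource


def report(q):
    # ru_maxrss is the peak resident set size; kilobytes on Linux.
    q.put(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)


if __name__ == '__main__':
    q = multiprocessing.Queue()
    procs = [multiprocessing.Process(target=report, args=(q,))
             for _ in range(10)]
    for p in procs:
        p.start()
    sizes = [q.get() for _ in procs]  # drain the queue before joining
    for p in procs:
        p.join()
    print('per-process peak RSS (KB):', sizes)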
