Read multiple files using multiprocessing

Problem description


I need to read some very large text files (100+ MB), process every line with a regex, and store the data in a structure. My structure inherits from defaultdict and has a read(self) method that reads the file named by self.file_name.

Here is a very simple (but not real) example. It does not use a regex, but it does split each line:

                  
    import multiprocessing
    from collections import defaultdict

    def SingleContainer():
        # Factory for the default value of each key.
        return list()

    class Container(defaultdict):
        """
        This class stores odd lines in self["odd"] and even lines in self["even"].
        It is stupid, but it's only an example. In the real case the class
        has additional methods that do computation on the data read.
        """
        def __init__(self, file_name):
            if type(file_name) != str:
                raise AttributeError, "%s is not a string" % file_name
            defaultdict.__init__(self, SingleContainer)
            self.file_name = file_name
            self.readen_lines = 0

        def read(self):
            print "start reading file %s" % self.file_name
            # The with block closes the file even if parsing raises.
            with open(self.file_name) as f:
                for line in f:
                    self.readen_lines += 1
                    values = line.split()
                    key = {0: "even", 1: "odd"}[self.readen_lines % 2]
                    self[key].append(values)
            print "readen %d lines from file %s" % (self.readen_lines, self.file_name)

    def do(file_name):
        # Runs in a worker process; the return value is sent back to the parent.
        container = Container(file_name)
        container.read()
        return container.items()

    if __name__ == "__main__":
        file_names = ["r1_200909.log", "r1_200910.log"]
        pool = multiprocessing.Pool(len(file_names))
        result = pool.map(do, file_names)
        pool.close()
        pool.join()
        print "Finish"
                  

At the end I need to merge all the results into a single Container. It is important that the order of the lines is preserved. My approach is too slow when returning values. Is there a better solution? I'm using Python 2.6 on Linux.

Recommended answer

You're probably hitting two problems.

One of them was already mentioned: you're reading multiple files at once. Those reads end up interleaved, which causes disk thrashing. You want to read each file in full, one at a time, and only then parallelize the computation on the data.
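The suggestion above can be sketched roughly as follows. This is a minimal illustration, not the answerer's code: it is written for Python 3 (the question uses 2.6), and the helper names `read_whole_file`, `parse_text`, and `read_then_parse` are hypothetical. Note that the file contents still have to be pickled and sent to the workers, so whether this helps depends on how expensive the per-line work is.

```python
import multiprocessing
from collections import defaultdict


def read_whole_file(file_name):
    """Read one file completely, in a single sequential pass."""
    with open(file_name) as f:
        return f.read()


def parse_text(text):
    """Split every line and store it under "odd" or "even" by line number."""
    container = defaultdict(list)
    for line_number, line in enumerate(text.splitlines(), start=1):
        key = "odd" if line_number % 2 else "even"
        container[key].append(line.split())
    return dict(container)


def read_then_parse(file_names, workers=2):
    """Read the files one after another, then parse them in parallel."""
    # Sequential reads: only one file is being read at any moment.
    texts = [read_whole_file(name) for name in file_names]
    # Parallel parsing: pool.map returns results in the order of its inputs.
    with multiprocessing.Pool(workers) as pool:
        return pool.map(parse_text, texts)
```

Calling `read_then_parse(["r1_200909.log", "r1_200910.log"])` from under an `if __name__ == "__main__":` guard would return one dictionary per file, in the same order as the file names.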

Second, you're hitting the overhead of Python's multiprocessing module. It doesn't use threads; it starts multiple processes and serializes (pickles) the results back through a pipe. That's very slow for bulk data. In fact, it seems to be slower than the work you're doing in the worker (at least in the example). This is the real-world cost of the GIL: threads can't run Python code in parallel, so you have to use processes and then pay to copy data between them.
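To get a feel for that serialization cost, one can time a pickle round trip on data shaped like the parsed lines. The sketch below is only an approximation of what multiprocessing does (it leaves out the pipe transfer itself), it targets Python 3, and the helper names and row counts are made up for illustration.

```python
import pickle
import time


def build_rows(n_lines, fields_per_line=5):
    """Build data shaped like the parsed lines: a list of lists of strings."""
    return [["field%d" % i for i in range(fields_per_line)]
            for _ in range(n_lines)]


def time_pickle_round_trip(rows):
    """Return (seconds, copy) for one pickle dumps/loads round trip."""
    start = time.perf_counter()
    payload = pickle.dumps(rows, protocol=pickle.HIGHEST_PROTOCOL)
    copy = pickle.loads(payload)
    return time.perf_counter() - start, copy


rows = build_rows(100000)
seconds, copy = time_pickle_round_trip(rows)
print("round trip of %d rows took %.3f s" % (len(rows), seconds))
```

Comparing this figure with the time the parsing itself takes gives a rough idea of whether the copy or the computation dominates.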

If I modify do() to return None instead of container.items(), which disables the extra data copy, this example is faster than a single thread, as long as the files are already cached:

Two processes (pool.map): 0.36 s elapsed, 168% CPU

One process (replace pool.map with map): 0.52 s elapsed, 98% CPU
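Returning None is only an experiment, since the parsed data is then lost. A less drastic variant, offered here as an assumption rather than something the answer proposes, is to do the follow-up computation inside the worker and return only a small summary. The sketch below (Python 3, hypothetical function names, in-memory lines instead of files) compares the pickled size of the two return values.

```python
import pickle
from collections import defaultdict


def parse_lines(lines):
    """Store the split lines under "odd" or "even" by line number."""
    container = defaultdict(list)
    for line_number, line in enumerate(lines, start=1):
        key = "odd" if line_number % 2 else "even"
        container[key].append(line.split())
    return container


def do_full(lines):
    """Like do() in the question: return every parsed row."""
    return list(parse_lines(lines).items())


def do_summary(lines):
    """Hypothetical variant: return only the number of rows per key."""
    return {key: len(rows) for key, rows in parse_lines(lines).items()}


sample = ["a b c"] * 1000
full_size = len(pickle.dumps(do_full(sample)))
summary_size = len(pickle.dumps(do_summary(sample)))
print("full result: %d bytes, summary: %d bytes" % (full_size, summary_size))
```

This only helps when the real computation can be reduced to a small result inside the worker; if the parent genuinely needs every row, the copy cannot be avoided this way.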

Unfortunately, the GIL problem is fundamental and can't be worked around from inside Python.
