
        How to run `selenium-chromedriver` in multiple threads

                  Problem description


                  I am using selenium and chrome-driver to scrape data from some pages and then run some additional tasks with that information (for example, typing comments on some pages).

                  My program has a button. Every time it is pressed, it calls thread_(self) (below), which starts a new thread. The target function self.main contains the code that runs all the selenium work on a chrome-driver.

                  def thread_(self):
                      th = threading.Thread(target=self.main)
                      th.start()
                  

                  My problem starts after the user presses the button the first time. The th thread opens browser A and does some work. While browser A is working, the user presses the button again, which opens browser B running the same self.main. I want every opened browser to run simultaneously. The problem I face is that when I run that thread function, the first browser stops and the second browser opens.

                  I know my code can create threads without limit, and I know this will affect the PC's performance, but I am OK with that. I want to speed up the work done by self.main!

                  Recommended answer

                  Threading for selenium speed-up

                  Consider the following functions, which illustrate how threads with selenium give some speed-up compared to a single-driver approach. The code below scrapes the HTML title from a page opened by selenium, using BeautifulSoup. The list of pages is links.

                  import time
                  from bs4 import BeautifulSoup
                  from selenium import webdriver
                  import threading
                  
                  def create_driver():
                      """Return a new headless Chrome webdriver."""
                      chrome_options = webdriver.ChromeOptions()
                      chrome_options.add_argument("--headless")  # comment this out to see the opened browsers
                      return webdriver.Chrome(options=chrome_options)

                  def get_title(url, driver=None):
                      """Print the HTML title of `url`, parsed with BeautifulSoup.
                      If `driver` is None, create a new chrome-driver and quit() it afterwards;
                      otherwise use the driver provided and do not quit() it."""
                      def print_title(drv):
                          drv.get(url)
                          soup = BeautifulSoup(drv.page_source, "lxml")
                          item = soup.find('title')
                          print(item.string.strip())

                      if driver:
                          print_title(driver)
                      else:
                          driver = create_driver()  # parameter renamed so it no longer shadows the webdriver module
                          try:
                              print_title(driver)
                          finally:
                              driver.quit()  # close the browser even if the page fails
                  
                  links = ["https://www.amazon.com", "https://www.google.com", "https://www.youtube.com/", "https://www.facebook.com/", "https://www.wikipedia.org/", 
                  "https://us.yahoo.com/?p=us", "https://www.instagram.com/", "https://www.globo.com/", "https://outlook.live.com/owa/"]
                  

                  Now call get_title on the links above.

                  Sequential approach

                  A single chrome driver, with all links passed to it sequentially. This takes 22.3 s on my machine (note: Windows).

                  start_time = time.time()
                  driver = create_driver()
                  
                  for link in links: # could be 'like' clicks 
                    get_title(link, driver)  
                  
                  driver.quit()
                  print("sequential took ", (time.time() - start_time), " seconds")
                  

                  Multithreaded approach

                  Using one thread per link. This takes 10.5 s, which is more than 2x faster.

                  start_time = time.time()    
                  threads = [] 
                  for link in links:  # each thread could be like a new 'click'
                      th = threading.Thread(target=get_title, args=(link,))
                      th.start()  # could `time.sleep` between 'clicks' to see what's up without the headless option
                      threads.append(th)
                  for th in threads:
                      th.join()  # main thread waits for the threads to finish
                  print("multiple threads took ", (time.time() - start_time), " seconds")
                  

                  There are other working examples of this approach. One of them uses a fixed number of threads in a ThreadPool and suggests that storing the chrome-driver instance initialized on each thread is faster than creating and starting it every time.
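The fixed-size pool idea can be sketched roughly as follows. This is not the code from those other examples, only a minimal illustration: the class name DriverPool and its methods are made up here, and driver_factory is assumed to be any zero-argument function that returns a driver, such as create_driver above. Each worker thread creates its driver once, keeps it in thread-local storage, and reuses it for every URL it is handed.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DriverPool:
    """Hypothetical helper: a fixed number of threads, one reused driver per thread."""

    def __init__(self, driver_factory, max_workers=3):
        self._factory = driver_factory    # assumed: zero-argument callable returning a driver
        self._max_workers = max_workers
        self._local = threading.local()   # each thread sees its own attributes here
        self._drivers = []                # every driver created, so all can be closed
        self._lock = threading.Lock()

    def _driver(self):
        """Return the calling thread's driver, creating it on first use."""
        driver = getattr(self._local, "driver", None)
        if driver is None:
            driver = self._factory()
            self._local.driver = driver
            with self._lock:
                self._drivers.append(driver)
        return driver

    def _title(self, url):
        driver = self._driver()
        driver.get(url)
        return driver.title

    def titles(self, urls):
        """Fetch the title of every URL, in order, then quit all drivers."""
        try:
            with ThreadPoolExecutor(max_workers=self._max_workers) as pool:
                return list(pool.map(self._title, urls))
        finally:
            for driver in self._drivers:
                driver.quit()
            self._drivers.clear()
```

With the functions above, usage would look like DriverPool(create_driver, max_workers=3).titles(links). At most max_workers browsers are ever open, so the number of threads no longer grows with the number of links.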

                  Still, I was not sure this was the optimal approach for getting considerable speed-ups with selenium, since threading on code that is not I/O bound ends up executing sequentially (one thread after another). Because of the Python GIL (Global Interpreter Lock), a Python process cannot run threads in parallel (that is, use multiple CPU cores).

                  To try to overcome the Python GIL limitation, I wrote the following code using the multiprocessing package and its Process class, and ran multiple tests. I even added random page hyperlink clicks to the get_title function above.

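                  import multiprocessing

                  def run_with_processes(links):
                      """Open each link in its own process and return the elapsed seconds."""
                      start_time = time.time()
                      processes = []
                      for link in links:  # each process is like a new 'click'
                          ps = multiprocessing.Process(target=get_title, args=(link,))
                          ps.start()  # could sleep 1 s between 'clicks' with `time.sleep(1)`
                          processes.append(ps)
                      for ps in processes:
                          ps.join()  # main process waits for the processes to finish
                      return time.time() - start_time

                  if __name__ == "__main__":  # guard needed where child processes re-import this module (e.g. Windows)
                      print("multiple processes took ", run_with_processes(links), " seconds")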
                  

                  Contrary to what I expected, Python multiprocessing.Process-based parallelism for selenium was on average around 8% slower than threading.Thread. Both, however, were on average more than twice as fast as the sequential approach. I then found out that selenium chrome-driver commands use HTTP requests (like POST and GET), so the work is I/O bound. It therefore releases the Python GIL, which does let the threads run in parallel.
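The effect of waiting on I/O can be seen without opening a browser. In this small sketch, time.sleep is only a stand-in (an assumption for illustration) for a driver command blocked on an HTTP response, since both release the GIL while waiting. The function names are made up here, and the timings say nothing about selenium itself.

```python
import threading
import time


def fake_command(seconds=0.2):
    """Stand-in for a driver command: the thread only waits, as it would for an HTTP response."""
    time.sleep(seconds)  # sleeping releases the GIL, like blocking socket I/O


def run_sequential(n):
    """Run n fake commands one after another and return the elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fake_command()
    return time.perf_counter() - start


def run_threaded(n):
    """Run n fake commands, each in its own thread, and return the elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_command) for _ in range(n)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return time.perf_counter() - start
```

Calling run_sequential(4) should take roughly four times the wait of a single command, while run_threaded(4) should take about as long as one command, because the waits overlap.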

                  This is not a definitive answer, as my tests were only a tiny example. Also, I'm using Windows, where multiprocessing has many limitations. Each new Process is not a fork as on Linux, which means, among other downsides, that a lot of memory is wasted.
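To see which behaviour applies on a given machine, the standard library can report the start method it will use. This is a small sketch. The function name start_method_summary is made up here, and the notes it returns are short paraphrases rather than full descriptions of each method.

```python
import multiprocessing


def start_method_summary():
    """Return the platform's default start method with a short note on what it implies."""
    method = multiprocessing.get_start_method()
    if method == "fork":
        note = "children start as a copy of the parent process"
    elif method == "spawn":
        note = "children start a fresh Python interpreter and re-import the main module"
    else:
        note = "children are forked from a separate server process"
    return f"{method}: {note}"
```

On Windows only "spawn" is available, which is why every child process has to load its own copy of the modules it needs instead of sharing the parent's memory.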

                  Taking all that into account: it seems that, depending on the use case, threads may be as good as or better than the heavier process-based approach (especially for Windows users).
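Applied to the button from the question, a thread-per-press version could look like the sketch below. Since self.main is not shown in the question, this rests on an assumption: each call to main creates its own driver as a local variable instead of sharing one stored on self. The class name BrowserLauncher, the job callable and wait_all are hypothetical names used only for illustration.

```python
import threading


class BrowserLauncher:
    """Hypothetical stand-in for the class that owns the button."""

    def __init__(self, driver_factory, job):
        self.driver_factory = driver_factory  # assumed: zero-argument callable, e.g. create_driver
        self.job = job                        # the selenium work, called with a driver
        self.threads = []

    def main(self):
        driver = self.driver_factory()  # local to this call: one browser per thread
        try:
            self.job(driver)
        finally:
            driver.quit()  # close this thread's browser even if the job fails

    def thread_(self):
        """Called on every button press: start one more independent thread."""
        th = threading.Thread(target=self.main)
        th.start()
        self.threads.append(th)
        return th

    def wait_all(self):
        """Block until every started thread has finished."""
        for th in self.threads:
            th.join()
```

Because no driver is shared between threads, pressing the button again adds browser B without touching browser A.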
