Is it possible to run another spider from a Scrapy spider?

Question

For now I have 2 spiders, and what I would like to do is:

1. Spider 1 goes to url1 and, if url2 appears, calls spider 2 with url2. It also saves the content of url1 using a pipeline.
2. Spider 2 goes to url2 and does something.

Due to the complexities of both spiders, I would like to have them separated.

My attempt using scrapy crawl:

                  def parse(self, response):
                      p = multiprocessing.Process(
                          target=self.testfunc())
                      p.join()
                      p.start()
                  
                  def testfunc(self):
                      settings = get_project_settings()
                      crawler = CrawlerRunner(settings)
                      crawler.crawl(<spidername>, <arguments>)
                  

It does load the settings but doesn't crawl:

                  2015-08-24 14:13:32 [scrapy] INFO: Enabled extensions: CloseSpider, LogStats, CoreStats, SpiderState
                  2015-08-24 14:13:32 [scrapy] INFO: Enabled downloader middlewares: DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, HttpAuthMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
                  2015-08-24 14:13:32 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
                  2015-08-24 14:13:32 [scrapy] INFO: Spider opened
                  2015-08-24 14:13:32 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
                  

The documentation has an example about launching a crawl from a script, but what I'm trying to do is launch another spider while using the scrapy crawl command.

Full code:

                  from scrapy.crawler import CrawlerRunner
                  from scrapy.utils.project import get_project_settings
                  from twisted.internet import reactor
                  from multiprocessing import Process
                  import scrapy
                  import os
                  
                  
                  def info(title):
                      print(title)
                      print('module name:', __name__)
                      if hasattr(os, 'getppid'):  # only available on Unix
                          print('parent process:', os.getppid())
                      print('process id:', os.getpid())
                  
                  
                  class TestSpider1(scrapy.Spider):
                      name = "test1"
                      start_urls = ['http://www.google.com']
                  
                      def parse(self, response):
                          info('parse')
                          a = MyClass()
                          a.start_work()
                  
                  
                  class MyClass(object):
                  
                      def start_work(self):
                          info('start_work')
                          p = Process(target=self.do_work)
                          p.start()
                          p.join()
                  
                      def do_work(self):
                  
                          info('do_work')
                          settings = get_project_settings()
                          runner = CrawlerRunner(settings)
                          runner.crawl(TestSpider2)
                          d = runner.join()
                          d.addBoth(lambda _: reactor.stop())
                          reactor.run()
                          return
                  
                  class TestSpider2(scrapy.Spider):
                  
                      name = "test2"
                      start_urls = ['http://www.google.com']
                  
                      def parse(self, response):
                          info('testspider2')
                          return
                  

What I would like is something like this:

1. Run scrapy crawl test1 (for example, when response.status_code is 200)
2. Inside test1, call scrapy crawl test2

Answer

I won't go in depth since this question is really old, but I'll go ahead and drop this snippet from the official Scrapy docs... You are very close! lol

                  import scrapy
                  from scrapy.crawler import CrawlerProcess
                  
                  class MySpider1(scrapy.Spider):
                      # Your first spider definition
                      ...
                  
                  class MySpider2(scrapy.Spider):
                      # Your second spider definition
                      ...
                  
                  process = CrawlerProcess()
                  process.crawl(MySpider1)
                  process.crawl(MySpider2)
                  process.start() # the script will block here until all crawling jobs are finished
                  

https://doc.scrapy.org/en/latest/topics/practices.html
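Note that CrawlerProcess starts and manages its own Twisted reactor, so a snippet like the one above is meant to be run as a standalone script (for example python run_both.py, a file name used here purely for illustration), not from a callback inside a spider that is already running under the scrapy crawl command.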

Then, using callbacks, you can pass items between your spiders and do whatever logic you're talking about; one way to wire that up is sketched below.
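As a concrete illustration of that last point, here is a minimal sketch of chaining the two crawls from one script so that spider 2 only runs after spider 1 has finished and receives whatever url2 links spider 1 found. This is not code from the original question or answer: the spider bodies, the url2 item field, the collected_urls list and the _remember_url helper are all hypothetical placeholders, and the item_scraped signal is just one of several ways to hand data from one crawl to the next.

    # Hypothetical sketch: spider bodies, the 'url2' field and the helper
    # names below are placeholders; adapt them to your real project.
    import scrapy
    from scrapy import signals
    from scrapy.crawler import CrawlerRunner
    from scrapy.utils.project import get_project_settings
    from twisted.internet import reactor, defer


    class TestSpider1(scrapy.Spider):
        name = "test1"
        start_urls = ["http://www.example.com"]

        def parse(self, response):
            # Items still flow through your pipeline, so url1's content is
            # saved as before; url2 candidates are emitted as items so the
            # script below can collect them.
            for href in response.css("a::attr(href)").getall():
                yield {"url2": response.urljoin(href)}


    class TestSpider2(scrapy.Spider):
        name = "test2"

        def parse(self, response):
            self.logger.info("spider 2 visited %s", response.url)


    collected_urls = []


    def _remember_url(item, response, spider):
        # item_scraped signal handler: remember every url2 spider 1 produced.
        if "url2" in item:
            collected_urls.append(item["url2"])


    @defer.inlineCallbacks
    def crawl_sequentially():
        runner = CrawlerRunner(get_project_settings())

        crawler1 = runner.create_crawler(TestSpider1)
        crawler1.signals.connect(_remember_url, signal=signals.item_scraped)
        yield runner.crawl(crawler1)        # wait for spider 1 to finish

        if collected_urls:                  # only run spider 2 if url2 appeared
            yield runner.crawl(TestSpider2, start_urls=collected_urls)

        reactor.stop()


    crawl_sequentially()
    reactor.run()

Because both crawls share one reactor in a single process, the two spiders can stay in separate classes (or even separate modules) while spider 2 is still started conditionally, only when spider 1 actually found a url2.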
