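I have been trying to use async to get rid of the extra callbacks in the parse method. I know there is a library, inline_requests, that can do this.
However, I would like to stick with async. What I cannot figure out is how to issue a POST request inside the parse method.
When I issue the POST request with inline_requests, it works: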
    import scrapy
    from inline_requests import inline_requests

    class HkexNewsSpider(scrapy.Spider):
        name = "hkexnews"
        start_url = "http://www.hkexnews.hk/sdw/search/searchsdw.aspx"

        def start_requests(self):
            yield scrapy.Request(self.start_url, callback=self.parse_item)

        @inline_requests
        def parse_item(self, response):
            # Collect every named <input> on the page (includes the ASP.NET hidden fields)
            payload = {
                item.css('::attr(name)').get(default=''): item.css('::attr(value)').get(default='')
                for item in response.css("input[name]")
            }
            payload['__EVENTTARGET'] = 'btnSearch'
            payload['txtStockCode'] = '00001'
            payload['txtParticipantID'] = 'A00001'
            resp = yield scrapy.FormRequest(self.start_url, formdata=payload, dont_filter=True)
            total_value = resp.css(".ccass-search-total > .shareholding > .value::text").get()
            yield {"Total Value": total_value}
When I try to issue the POST request with async, the result is None:
    async def parse(self, response):
        payload = {
            item.css('::attr(name)').get(default=''): item.css('::attr(value)').get(default='')
            for item in response.css("input[name]")
        }
        payload['__EVENTTARGET'] = 'btnSearch'
        payload['txtStockCode'] = '00001'
        payload['txtParticipantID'] = 'A00001'
        request = response.follow(self.start_url, method='POST', body=payload, dont_filter=True)
        resp = await self.crawler.engine.download(request, self)
        total_value = resp.css(".ccass-search-total > .shareholding > .value::text").get()
        yield {"Total Value": total_value}
How can I get the result with the latter approach?
Answer from a uj5u.com user:
    import scrapy

    class HkexNewsSpider(scrapy.Spider):
        name = "hkexnews"
        start_urls = ['http://www.hkexnews.hk/sdw/search/searchsdw.aspx']

        async def parse(self, response):
            # Collect every named <input> on the page (includes the ASP.NET hidden fields)
            payload = {
                item.css('::attr(name)').get(default=''): item.css('::attr(value)').get(default='')
                for item in response.css("input[name]")
            }
            payload['__EVENTTARGET'] = 'btnSearch'
            payload['txtStockCode'] = '00001'
            payload['txtParticipantID'] = 'A00001'
            # FormRequest form-encodes the payload and sends it as a POST
            request = scrapy.FormRequest(self.start_urls[0], formdata=payload, dont_filter=True)
            resp = await self.crawler.engine.download(request, self)
            total_value = resp.css(".ccass-search-total > .shareholding > .value::text").get()
            yield {"Total Value": total_value}
Output:
{'Total Value': '2,546,531,648'}
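For what it's worth, the visible difference between the two async snippets is how the payload is sent: the question passes the dict as body= to response.follow, while the answer passes it as formdata= to scrapy.FormRequest, which URL-encodes the fields and marks the request as a form submission. Below is a rough standard-library sketch of that encoding step, not Scrapy's actual implementation. The helper name encode_form and the three-field payload are assumptions made for illustration only; the real payload also carries the hidden fields scraped from the page.

```python
from urllib.parse import urlencode


def encode_form(payload: dict) -> bytes:
    """URL-encode a dict of form fields into a POST body (hypothetical helper)."""
    return urlencode(payload).encode("utf-8")


# Illustrative subset of the fields set in parse(); the hidden ASP.NET
# fields collected from the page are left out here.
payload = {
    "__EVENTTARGET": "btnSearch",
    "txtStockCode": "00001",
    "txtParticipantID": "A00001",
}

body = encode_form(payload)
# A form POST is normally sent with this content type
headers = {"Content-Type": "application/x-www-form-urlencoded"}

print(body)
# b'__EVENTTARGET=btnSearch&txtStockCode=00001&txtParticipantID=A00001'
```

A server expecting a form submission reads the fields from a body in this shape, which a raw dict passed as body= does not provide.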