Scrapy crawl shows output in the terminal, but not in the JSON/Excel file

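I can see the scraped results in the terminal, but when I add -o .csv, the output of the third line of my code appears in the JSON/Excel file while the first and second do not (I am trying to scrape the start date and end date, as shown in the image below).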

Code:

def parse(self, response):
    for quote in response.css("div.views-row"):
        yield {
            'Max ERC': quote.xpath('.//div[strong[.="Max ERC Funding"]]/following-sibling::text()[1]').extract(),
            'Max ERC1': quote.xpath('.//div[strong[.="Max ERC Funding"]]/following-sibling::text()').extract(),
            'TEST': quote.xpath('./div[contains(@class,"views-field-acronym")]/span[contains(@class,"field-content")]/text()').extract(),
        }

Empty in the Excel/JSON file:

Output for Max ERC and Max ERC1 in the terminal:

HTML code:

Answer by panlixin:

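I'm not sure, but I suspect this is a mix-up between lists and CSV, which I generally try to avoid. Could you try the following parsing code to fix the problem?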

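import re  # required by re.findall below

# Runs inside the loop from the question: for quote in response.css("div.views-row")
# Assumes exactly three text nodes follow the "Max ERC Funding" div; the last holds both dates.
fund, _, start_end_date = quote.xpath('.//div[strong[.="Max ERC Funding"]]/following-sibling::text()').extract()
fund = fund.strip()
start_date, end_date = re.findall(r"\d{4}-\d{2}-\d{2}", start_end_date)
acronym = quote.xpath('./div[contains(@class,"views-field-acronym")]/span[contains(@class,"field-content")]/text()').extract()[0]
yield {
    "fund": fund,
    "start_date": start_date,
    "end_date": end_date,
    "acronym": acronym,
}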
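
The idea above (flat strings instead of raw lists) can be tried without running a spider. The following is only a minimal sketch using the standard library: flatten_funding and the sample text nodes are hypothetical, made up to resemble what .extract() might return for this page, and the real page may be structured differently. Unlike the tuple unpacking above, it does not assume a fixed number of text nodes.

```python
import csv
import io
import re

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def flatten_funding(text_nodes, acronym):
    """Turn raw text nodes (a list of strings) into flat string fields."""
    # Drop whitespace-only nodes so the result does not depend on their count.
    cleaned = [t.strip() for t in text_nodes if t.strip()]
    fund = cleaned[0] if cleaned else ""
    # Search all remaining nodes for dates instead of relying on a position.
    dates = DATE_RE.findall(" ".join(cleaned[1:]))
    # Pad with empty strings so missing dates give empty cells, not an error.
    start_date, end_date = (dates + ["", ""])[:2]
    return {
        "fund": fund,
        "start_date": start_date,
        "end_date": end_date,
        "acronym": acronym,
    }


# Hypothetical sample data, not taken from the real page.
sample_nodes = [
    "\n  1 500 000 EUR\n",
    "\n",
    "\n  Start date: 2019-01-01, End date: 2023-12-31\n",
]
row = flatten_funding(sample_nodes, "DEMO")

# Write the row as CSV into memory to check that each field is a single cell.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
print(csv_text)
```

In a spider, the same helper could be called with the list returned by .extract() and its result yielded as the item, so that each CSV cell holds one stripped string rather than a list containing newline-only entries.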