I've rewritten this question to make clearer what I need help with. I'm trying to scrape a page of search results like this one:
http://search.people.com.cn/cnpeople/search.do?pageNum=1&keyword=%C8%F0%B5%E4&siteName=news&facetFlag=true&nodeType=belongsId&nodeId=0
But when I run it in Scrapy, the requests seem to be redirected:
2020-01-10 09:55:38 [scrapy.downloadermiddlewares.redirect] DEBUG: Redirecting (302) to <GET http://search.people.com.cn/cnpeople/news/getNewsResult.jsp> from <GET http://search.people.com.cn/cnpeople/search.do?pageNum=7&keyword=%C8%F0%B5%E4&siteName=news&facetFlag=true&nodeType=belongsId&nodeId=0>
And then nothing is scraped.
Is redirecting me to a list of results just the way the website works, or is it trying to prevent me from scraping it? Is there anything I can do?
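One way to check what the server is doing is to request a single results page without following the redirect, then look at the status code and the `Location` header. The sketch below uses only the Python standard library rather than Scrapy. The names `NoRedirect` and `probe_redirect` are made up for illustration, and it assumes the server answers a plain GET the same way it answers the spider.

```python
import urllib.error
import urllib.request


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Redirect handler that declines to follow any redirect."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Returning None makes urllib raise HTTPError for the 3xx response
        # instead of silently fetching the redirect target.
        return None


def probe_redirect(url, timeout=10):
    """Return (status_code, Location header or None) for a single GET."""
    opener = urllib.request.build_opener(NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.status, resp.headers.get("Location")
    except urllib.error.HTTPError as err:
        # Once redirects are not followed, 3xx responses (like 4xx/5xx) land here.
        return err.code, err.headers.get("Location")
```

Calling `probe_redirect` on one of the `search.do` URLs should show whether the 302 always points at `getNewsResult.jsp`, which would suggest the redirect is part of how the site serves results rather than a response to the scraper.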
Below is my spider code:
import scrapy


class QuotesSpider(scrapy.Spider):
    name = "RMW"

    def start_requests(self):
        # Request pages 1 through 9 of the search results.
        for num in range(1, 10):
            url = (
                'http://search.people.com.cn/cnpeople/search.do?pageNum=' + str(num)
                + '&keyword=%C8%F0%B5%E4&siteName=news&facetFlag=true&nodeType=belongsId&nodeId=0'
            )
            yield scrapy.Request(url=url, callback=self.parse)

    def parse(self, response):
        # Yield the first link found inside each <ul> element.
        for link in response.css("ul"):
            yield {
                'link': link.css("a::attr(href)").get()
            }
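The `keyword` value in these URLs appears to be the GBK percent-encoding of the search term (`%C8%F0%B5%E4` decodes to 瑞典, "Sweden"), so the page URLs could also be built from readable parameters instead of string concatenation. This is only a sketch using the standard library; the function name `build_search_url` is made up for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "http://search.people.com.cn/cnpeople/search.do"


def build_search_url(page, keyword="瑞典"):
    """Build the results-page URL for one page number."""
    params = {
        "pageNum": page,
        "keyword": keyword,
        "siteName": "news",
        "facetFlag": "true",
        "nodeType": "belongsId",
        "nodeId": 0,
    }
    # encoding="gbk" reproduces the %C8%F0%B5%E4 form seen in the original URL.
    return BASE_URL + "?" + urlencode(params, encoding="gbk")


# Same nine pages the spider requests.
urls = [build_search_url(n) for n in range(1, 10)]
```

Each entry in `urls` can be passed to `scrapy.Request` exactly as the concatenated strings are in the spider above.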
I'd really appreciate any help resolving this from somebody with more expertise in the area.