Hello, and thanks for your interest in Crawlee! We haven't looked into selenium-driverless yet, but at a glance it looks interesting. Could you explain how it differs from "regular" Selenium as far as scraping and crawling are concerned? By the way, a feature request for a Selenium-based crawler already exists (#284), but it hasn't gained much traction.
-
Hi,
First, thank you so much for your work on this package; it's really great.
My question is: are there any plans to add new crawlers, such as one based on selenium-driverless, in the near future? I've recently been trying PlaywrightCrawler on many websites, and some of them detect it and trigger a Cloudflare challenge, whereas the same sites let selenium-driverless through easily.
For example, this site with the following code:
Expected behavior
The crawler should visit the homepage, extract the product links, then visit each product page and extract its data. This is very basic crawling.
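The intended flow can be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the code from the original post and not Crawlee's API. The in-memory `SITE`, the `shop.example` domain, the `/product/` URL pattern, and the `<h1>` title field are all hypothetical stand-ins for a real site and real HTTP fetching.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical in-memory "site" standing in for real HTTP responses.
SITE = {
    "https://shop.example/": (
        '<a href="/product/1">A</a>'
        '<a href="/product/2">B</a>'
        '<a href="/about">About</a>'
    ),
    "https://shop.example/product/1": "<h1>Widget</h1>",
    "https://shop.example/product/2": "<h1>Gadget</h1>",
}


class PageParser(HTMLParser):
    """Collects all link targets and the text of the first <h1>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = None
        self._in_h1 = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "h1" and self.title is None:
            self._in_h1 = True

    def handle_data(self, data):
        if self._in_h1:
            self.title = data.strip()
            self._in_h1 = False

    def handle_endtag(self, tag):
        if tag == "h1":
            self._in_h1 = False


def crawl(start_url, fetch):
    """Visit the start page, enqueue product links, then scrape each product."""
    queue = deque([start_url])
    seen = {start_url}
    products = []
    while queue:
        url = queue.popleft()
        parser = PageParser()
        parser.feed(fetch(url))
        if "/product/" in url:
            # Product page: extract its data and do not follow further links.
            products.append({"url": url, "title": parser.title})
            continue
        # Listing page: enqueue only product links not seen before.
        for href in parser.links:
            link = urljoin(url, href)
            if "/product/" in link and link not in seen:
                seen.add(link)
                queue.append(link)
    return products


products = crawl("https://shop.example/", SITE.__getitem__)
```

Here `fetch` is just a dictionary lookup; in a real crawler it would be the browser or HTTP request, which is the step where the blocking occurs.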
Actual behavior
The homepage loads successfully, but all subsequent requests are blocked by Cloudflare.
What I tried
Of course, adding a whole new crawler is a big effort and far from simple, but I'd like to hear your opinion.
If you can suggest any flags I could pass to PlaywrightCrawler to make it harder to detect, I'd be very thankful too.
Thanks in advance.