Crawlee is a TypeScript library for web scraping and browser automation that aims to make crawlers look human-like to bot-detection systems. It supports Playwright, Puppeteer, proxy rotation, and persistent queues.
Ferret v2 is a Go-based declarative system for web scraping that introduces a native Go API and a compatibility layer to ease migration from v1. It balances embeddability, speed, and API evolution.
Headless Chrome Crawler offers a high-level API built on Puppeteer for scraping dynamic, JavaScript-heavy websites, with concurrency control, caching, and jQuery injection. It is suited to complex scraping tasks.
Maigret is a Python-based OSINT tool that scrapes public profiles by username from 3,000+ sites without API keys. It features adaptive scraping, anti-blocking, and a web interface.
Pydoll is a Python library for Chromium automation using the Chrome DevTools Protocol. It offers async-native APIs and Pydantic-powered data extraction for structured, validated scraping.
undetected-chromedriver patches Selenium’s ChromeDriver so that it avoids triggering anti-bot defenses such as Distil Networks and DataDome. It supports Chrome beta and other Chromium-based browsers.
Scrapy is a Python framework designed for efficient and extensible web scraping, featuring a powerful selector system and item pipelines for data extraction and processing.
Requests-HTML extends Python’s Requests library with Chromium-based JavaScript rendering, CSS/XPath selectors, and async support, making it easier to scrape dynamic web pages.
Scrapling is an adaptive web scraping framework with AI integration, designed to cope with site changes and anti-bot systems. It supports large-scale concurrent crawling with proxy rotation.
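
Several of the tools above list proxy rotation as a feature. As a rough illustration of the idea only, and not the API of Crawlee, Scrapling, or any other tool listed here, round-robin rotation can be sketched in plain Python with the standard library. The function name `assign_proxies`, the proxy endpoints, and the target URLs are all hypothetical.

```python
from itertools import cycle
from typing import Iterable, Iterator, List, Tuple


def assign_proxies(urls: Iterable[str], proxies: List[str]) -> Iterator[Tuple[str, str]]:
    """Pair each URL with the next proxy in round-robin order."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    pool = cycle(proxies)  # endless iterator: first, second, ..., first, second, ...
    for url in urls:
        yield url, next(pool)


# Hypothetical proxy endpoints and target URLs, for illustration only.
PROXIES = ["http://proxy-a.example:8080", "http://proxy-b.example:8080"]
URLS = [f"https://example.com/page/{n}" for n in range(1, 4)]

plan = list(assign_proxies(URLS, PROXIES))
for url, proxy in plan:
    print(f"{url} via {proxy}")
```

Each request simply takes the next proxy in the list and wraps around at the end. This sketch only plans the assignment and sends no traffic; a real crawler would pass each pair to its HTTP client or browser driver.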