Noureddine RAMDI Dinour

Lead Developer & AI Enthusiast — Software Architecture, AI/LLM, Infrastructure Automation

29 results for Web-Scraping

Crawlee: a TypeScript library for stealthy web scraping and browser automation
Crawlee is a TypeScript library for web scraping and browser automation with human-like stealth. Supports Playwright, Puppeteer, proxy rotation, and persistent queues.
github-stars typescript web scraping browser automation playwright Created Sat, 02 May 2026 20:07:04 +0000
Ferret v2: A declarative Go engine for web data extraction with a new API architecture
Ferret v2 is a Go-based declarative system for web scraping that introduces a native Go API and a compatibility layer to ease migration from v1. It balances embeddability, speed, and API evolution.
github-stars go web-scraping data-extraction api-design Created Sat, 02 May 2026 20:07:04 +0000
Headless Chrome Crawler: Simplifying Dynamic Web Scraping with Puppeteer
Headless Chrome Crawler offers a high-level API on top of Puppeteer for scraping dynamic, JavaScript-heavy websites, with concurrency, caching, and jQuery injection. Ideal for complex scraping tasks.
github-stars javascript web-scraping puppeteer headless-chrome Created Sat, 02 May 2026 20:07:04 +0000
Maigret: A resilient OSINT username scraper across thousands of sites
Maigret is a Python-based OSINT tool that scrapes public profiles by username from 3,000+ sites without API keys. It features adaptive scraping, anti-blocking, and a web interface.
github-stars python osint web scraping username enumeration Created Sat, 02 May 2026 20:07:04 +0000
Pydoll: Async-native Chromium automation with typed extraction for web scraping
Pydoll is a Python library for Chromium automation using Chrome DevTools Protocol. It offers async-native APIs and Pydantic-powered data extraction for structured, validated scraping.
github-stars python chromium async web scraping Created Sat, 02 May 2026 20:07:04 +0000
undetected-chromedriver: patching Selenium to evade anti-bot detection
undetected-chromedriver patches Selenium’s ChromeDriver to bypass anti-bot defenses such as Distil Networks and DataDome. It also supports Chrome beta and other Chromium-based browsers.
github-stars python selenium chromedriver web-scraping Created Sat, 02 May 2026 20:07:04 +0000
Scrapy: a modular Python framework for scalable web scraping
Scrapy is a Python framework designed for efficient and extensible web scraping, featuring a powerful selector system and item pipelines for data extraction and processing.
github-stars python web scraping scrapy data extraction Created Sun, 26 Apr 2026 23:47:28 +0000
Requests-HTML: Pythonic web scraping with built-in JavaScript rendering
Requests-HTML extends Python’s Requests library with Chromium-based JavaScript rendering, CSS/XPath selectors, and async support for scraping dynamic web pages.
github-stars python web scraping html parsing javascript rendering Created Sun, 26 Apr 2026 17:51:11 +0000
Scrapling: adaptive web scraping with AI integration for resilient data extraction
Scrapling offers an adaptive web scraping framework with AI integration to handle site changes and anti-bot systems, supporting large-scale concurrent crawling with proxy rotation.
github-stars python web scraping adaptive parsing ai integration Created Sun, 26 Apr 2026 17:51:11 +0000