<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Common-Crawl on Noureddine RAMDI</title><link>https://ramdi.fr/tags/common-crawl/</link><description>Recent content in Common-Crawl on Noureddine RAMDI</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sat, 23 May 2026 20:41:27 +0000</lastBuildDate><atom:link href="https://ramdi.fr/tags/common-crawl/index.xml" rel="self" type="application/rss+xml"/><item><title>news-please: a Python crawler for structured news extraction with Common Crawl support</title><link>https://ramdi.fr/github-stars/news-please-a-python-crawler-for-structured-news-extraction-with-common-crawl-support/</link><pubDate>Mon, 04 May 2026 10:23:02 +0000</pubDate><guid>https://ramdi.fr/github-stars/news-please-a-python-crawler-for-structured-news-extraction-with-common-crawl-support/</guid><description>news-please is a Python tool built on Scrapy for crawling and extracting structured news data, supporting Common Crawl archives and multiple storage backends.</description></item></channel></rss>