Noureddine RAMDI / MD-This-Page: a Chrome extension that turns web pages into clean Markdown for LLM workflows

Created Mon, 04 May 2026 10:23:02 +0000 Modified Sat, 23 May 2026 20:41:27 +0000

Ademking/MD-This-Page

MD-This-Page solves a common pain point: extracting meaningful content from cluttered web pages and converting it into a format that large language models (LLMs) can digest efficiently.

What MD-This-Page does and how it works

MD-This-Page is a Chrome extension built with the Plasmo framework and React. Its primary function is to transform any webpage you visit into clean, well-structured Markdown with a single click. This Markdown output is specifically tailored to be “LLM-ready,” meaning it strips away unnecessary clutter like navigation menus, ads, and boilerplate content to preserve only the main article or content body.

Under the hood, the extension relies on two main libraries for its extraction pipeline. First, it uses Mozilla’s Readability library to identify and isolate the core content of the webpage. Readability parses the DOM and heuristically removes extraneous elements, leaving behind just the main article or relevant text.

Once Readability extracts the cleaned HTML, MD-This-Page converts it into Markdown using Turndown, a well-known HTML-to-Markdown converter. This two-step process (content extraction followed by format conversion) is a clean pattern for preparing web content for AI workflows, where Markdown is preferred for its simplicity and readability.
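To make the extract-then-convert pattern concrete, here is a minimal sketch of how the two stages could be composed. It is not taken from the extension's source: the names pageToMarkdown, toyExtract, and toyConvert are hypothetical, and the toy stages are regex stand-ins so the example runs without a DOM or third-party packages. They do not reflect how Readability or Turndown work internally.

```typescript
// The subset of an extraction result this sketch relies on. Readability's
// parse() returns an object that includes `title` and `content` (cleaned
// HTML), or null when it cannot extract anything.
interface ExtractedArticle {
  title: string;
  content: string; // cleaned HTML
}

type ExtractStage = (rawHtml: string) => ExtractedArticle | null;
type ConvertStage = (cleanHtml: string) => string;

// Compose the two stages: extract first, then convert whatever survived.
function pageToMarkdown(
  rawHtml: string,
  extract: ExtractStage,
  convert: ConvertStage
): string | null {
  const article = extract(rawHtml);
  if (article === null) return null; // nothing article-like was found
  const body = convert(article.content).trim();
  return article.title ? `# ${article.title}\n\n${body}` : body;
}

// Toy stand-in for the extraction stage (assumption: content lives in <article>).
const toyExtract: ExtractStage = (rawHtml) => {
  const title = /<title>([\s\S]*?)<\/title>/i.exec(rawHtml)?.[1]?.trim() ?? "";
  const main = /<article>([\s\S]*?)<\/article>/i.exec(rawHtml)?.[1];
  return main === undefined ? null : { title, content: main.trim() };
};

// Toy stand-in for the conversion stage: handles only <h2> and <p>.
const toyConvert: ConvertStage = (cleanHtml) =>
  cleanHtml
    .replace(/<h2>([\s\S]*?)<\/h2>/gi, "## $1\n\n")
    .replace(/<p>([\s\S]*?)<\/p>/gi, "$1\n\n")
    .replace(/<[^>]+>/g, ""); // drop any tag the toy converter does not know

const demoPage =
  "<html><head><title>Hello</title></head><body><nav>Menu</nav>" +
  "<article><h2>Intro</h2><p>Body text.</p></article></body></html>";
console.log(pageToMarkdown(demoPage, toyExtract, toyConvert));
```

In a real content script, the stages would presumably be backed by the libraries themselves: `new Readability(document.cloneNode(true)).parse()` for extraction (cloning because Readability mutates the document it is given) and `new TurndownService().turndown(article.content)` for conversion. Keeping the stages behind small function types also makes the pipeline easy to test without a browser.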

The extension also offers customizable output options, letting users toggle whether to include images, links, and metadata in the exported Markdown. Export methods include copying the Markdown to the clipboard, saving it as a .md file, or generating a prompt-format version optimized for feeding into LLMs.
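As an illustration of how such toggles might be applied, the sketch below post-processes a Markdown string. Everything here is an assumption made for the example: the option names (includeImages, includeLinks, includeMetadata), the front-matter layout, and the prompt wrapper are hypothetical and are not taken from the extension's code, which may well apply these settings at a different stage.

```typescript
// Hypothetical option names; the extension's real settings may differ.
interface MdExportOptions {
  includeImages: boolean;
  includeLinks: boolean;
  includeMetadata: boolean;
}

interface MdPageMeta {
  title: string;
  url: string;
}

function applyExportOptions(
  markdown: string,
  meta: MdPageMeta,
  opts: MdExportOptions
): string {
  let out = markdown;
  if (!opts.includeImages) {
    // Remove inline images: ![alt](src)
    out = out.replace(/!\[[^\]]*\]\([^)]*\)/g, "");
  }
  if (!opts.includeLinks) {
    // Keep the link text and drop the target: [text](href) -> text.
    // A leading "!" marks an image, which is left untouched here.
    out = out.replace(
      /(!?)\[([^\]]*)\]\([^)]*\)/g,
      (match: string, bang: string, text: string) => (bang ? match : text)
    );
  }
  // Collapse runs of blank lines left behind by the removals.
  out = out.replace(/\n{3,}/g, "\n\n").trim();
  if (opts.includeMetadata) {
    // JSON.stringify quotes the title so colons or quotes in it stay intact.
    out = `---\ntitle: ${JSON.stringify(meta.title)}\nsource: ${meta.url}\n---\n\n${out}`;
  }
  return out;
}

// One possible "prompt format": a short instruction plus delimited content.
function toPromptFormat(markdown: string, meta: MdPageMeta): string {
  return [
    `The following is the content of "${meta.title}" (${meta.url}), converted to Markdown.`,
    "",
    "<page>",
    markdown,
    "</page>",
  ].join("\n");
}
```

Applying the toggles to the Markdown output keeps the example small, but filtering the HTML before conversion would be an equally reasonable design and avoids regex edge cases such as brackets inside link text.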

MD-This-Page is built as a Manifest V3 Chrome extension, reflecting the latest Chrome extension standards and security requirements. Styling is handled with Tailwind CSS, which keeps the UI simple and responsive.

Technical strengths and tradeoffs

The standout technical feature of MD-This-Page is the two-stage extraction pipeline combining Mozilla Readability and Turndown. The output is both stripped of clutter and semantically structured as Markdown, which is far easier for LLMs to parse than raw HTML.

The decision to build the extension with Plasmo and React offers a modern developer experience. Plasmo streamlines building Manifest V3 extensions with React, providing hot reloading, easy packaging, and a well-structured project setup. The codebase is surprisingly clean, with a clear separation between content extraction logic and UI components.

However, Manifest V3 introduces some constraints, such as stricter permissions and background service worker limitations, which can complicate extension behavior and performance. MD-This-Page handles these gracefully, but it’s a tradeoff developers should understand when building similar extensions.

Customization options for output give the extension versatility but also add complexity to the UI and code paths. The code manages this well, but there is a balance to strike between configurability and simplicity: users who want a no-frills experience may find the toggles distracting.

One limitation is that content extraction relies heavily on Readability’s heuristics, which, while effective for many articles, may occasionally misidentify the main content or omit important context. This is a common challenge with any automated content extraction.
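One common way to soften this limitation is to sanity-check the extraction result and fall back to the full page when it looks too thin. The sketch below shows that idea only; it is not confirmed to be something the extension does, and the names chooseContent and minChars, as well as the 200-character default, are arbitrary assumptions for illustration.

```typescript
// The subset of an extraction result this sketch inspects. Readability's
// parse() result includes `textContent` (plain text) and `content` (HTML).
interface ExtractionResult {
  textContent: string;
  content: string;
}

interface ContentChoice {
  html: string;
  usedFallback: boolean;
}

// Use the extracted article only if it contains a plausible amount of text;
// otherwise fall back to the caller-supplied HTML (for example, the page body).
function chooseContent(
  extracted: ExtractionResult | null,
  fallbackHtml: string,
  minChars: number = 200 // arbitrary threshold chosen for this sketch
): ContentChoice {
  const textLength = extracted?.textContent.trim().length ?? 0;
  if (extracted !== null && textLength >= minChars) {
    return { html: extracted.content, usedFallback: false };
  }
  return { html: fallbackHtml, usedFallback: true };
}
```

Readability also ships an `isProbablyReaderable` helper that can serve as a cheap pre-check before running a full parse. Either way, surfacing `usedFallback` to the user would make it clear when the output is likely to contain more clutter than usual.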

Getting started with MD-This-Page

This extension is built with Plasmo and React.

Prerequisites

  • Node.js
  • pnpm (or npm, yarn)

Installation & Development

  1. After cloning the repository, navigate to the project directory:

    cd md-this-page
    
  2. Install dependencies:

    pnpm install
    
  3. Run the development server:

    pnpm dev
    

    This will run the Plasmo dev server and generate a build/chrome-mv3-dev directory.

  4. Load the extension in Chrome:

    • Go to chrome://extensions/
    • Enable Developer mode
    • Click Load unpacked
    • Select the build/chrome-mv3-dev directory from this project.

Building for Production

To create a production build of the extension:

    pnpm build

This will output the production-ready extension into build/chrome-mv3-prod.

Verdict: who should consider MD-This-Page?

MD-This-Page is a practical tool for developers and AI practitioners who frequently feed web content into language models and want that content in a clean Markdown format. Its approach to content extraction and conversion is straightforward and effective for many common article-style pages.

That said, it’s not a silver bullet for all types of web content. Pages with complex layouts, dynamic content, or non-article formats may see less reliable extraction results due to the reliance on Readability heuristics.

If you build or maintain AI tools that require efficient ingestion of web data, or if you want a quick way to generate Markdown from webpages without manual cleanup, this extension is worth a look. It also serves as a solid example of how to combine existing libraries into a cohesive, user-friendly developer tool.

For browser extension developers, the repo is useful as a hands-on reference for Manifest V3 development with Plasmo and React, showcasing a clean architecture and thoughtful UX design.

Overall, MD-This-Page fills a real need with a practical, well-engineered solution that balances functionality and developer experience.


→ GitHub Repo: Ademking/MD-This-Page ⭐ 636 · TypeScript