serp.fast

Mozilla Readability

The pure-JavaScript library Firefox uses for Reader Mode – extracts the primary article from an HTML document with no dependencies.

Maintained by Nathan Kessler · Updated

Open-source scraping tools give engineering teams full control over their web data pipeline. You choose where to deploy, how to scale, and what data to collect – with no vendor lock-in or per-request pricing. The trade-off is that you take on infrastructure maintenance and anti-bot engineering, which commercial APIs handle for you.

Features

JS Rendering
Structured Output
Open Source
Self-Hosted Option
Pricing: Free

Editorial assessment

The same rule-based extractor that powers Firefox Reader Mode, packaged as a standalone JavaScript module with zero dependencies of its own. It operates on a DOM document, so in Node it needs a DOM implementation such as jsdom, while in the browser it can run against a clone of the page's document. Fast, deterministic, and aggressive about stripping boilerplate – extraction quality on news and article content holds up well against larger systems. The natural choice for Node pipelines or browser-side extraction. Pair it with Cheerio if you also need DOM traversal, reach for Trafilatura instead in Python, or use Crawl4AI or Firecrawl when you need fetching and rendering alongside extraction.
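For a sense of what the Node setup looks like, here is a minimal sketch. It assumes the `@mozilla/readability` and `jsdom` packages are installed; the `extractArticle` helper and the `ExtractedArticle` shape are hypothetical names for illustration, not part of either library.

```typescript
import { JSDOM } from "jsdom";
import { Readability } from "@mozilla/readability";

// Hypothetical result shape for this sketch; parse() returns more fields
// (excerpt, byline, and so on) than the three kept here.
interface ExtractedArticle {
  title: string;
  textContent: string;
  html: string;
}

// Hypothetical helper: turn an HTML string into the main article, or null
// when Readability cannot find one.
export function extractArticle(html: string, url: string): ExtractedArticle | null {
  // Readability works on a DOM document, which jsdom supplies in Node.
  // Passing the page URL lets relative links resolve to absolute ones.
  const dom = new JSDOM(html, { url });

  // parse() modifies the document it is given, so each call gets a fresh DOM.
  const article = new Readability(dom.window.document).parse();
  if (article === null) return null;

  return {
    title: article.title ?? "",
    textContent: article.textContent ?? "",
    html: article.content ?? "",
  };
}
```

Fetching the HTML is left to the caller, which matches the division of labour described above: Readability only does extraction. In the browser the same `new Readability(doc).parse()` call applies, usually on `document.cloneNode(true)` so the live page is not altered.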

How Mozilla Readability compares

Trafilatura

Trafilatura is the Python equivalent and tends to win head-to-head accuracy benchmarks on long-tail content.

Cheerio

Cheerio is a general jQuery-like parser – use it alongside Readability when you also need custom DOM selection.

Crawl4AI

Crawl4AI handles fetching, JS rendering, and LLM-ready markdown that Readability leaves to you.

