serp.fast

HTTPx + Parsel

Modern Python HTTP client (HTTPx) paired with Scrapy's extraction library (Parsel) – lightweight async scraping without a framework.

By Nathan Kessler

Each tool is evaluated against our methodology using public docs, vendor demos, and hands-on testing.

Open source scraping frameworks give engineering teams full control over their web data pipeline. You choose where to deploy, how to scale, and what data to collect – with no vendor lock-in or per-request pricing. The trade-off is infrastructure maintenance and anti-bot engineering, which commercial APIs handle for you.

Features

JS Rendering
Structured Output
Open Source
Self-Hosted Option
Pricing: Free

Editorial assessment

The minimalist's scraping stack: HTTPx for async HTTP requests plus Parsel for CSS/XPath extraction. It is lighter than Scrapy, and more capable than requests + Beautiful Soup thanks to native async requests, HTTP/2, and XPath selectors. The catch is that you are assembling your own framework from parts: there is no crawling orchestration, rate limiting, or data pipeline built in. Best for developers who want full control and are comfortable building their own infrastructure.

How HTTPx + Parsel compares

Scrapy

Scrapy uses Parsel internally and adds everything else – crawling, rate limiting, data export.

Crawl4AI

Crawl4AI provides a complete package if you want AI-native output without assembling parts.

Beautiful Soup

Beautiful Soup is the traditional pairing with requests, though it has no XPath support of its own, which Parsel provides alongside CSS selectors.

Frequently asked questions

Is HTTPx + Parsel free?

Yes. Both are free, BSD-licensed Python libraries you install with pip. There is no paid tier, no usage metering, and no hosted service to pay for. HTTPx handles the HTTP requests and Parsel handles CSS and XPath extraction. Your only real costs are the infrastructure you run them on and any proxies or rendering services you add yourself, because neither library bundles those.

Is HTTPx + Parsel open source?

Yes. HTTPx is open source under a BSD license, maintained on GitHub by the encode organization. Parsel is also BSD-licensed and lives under the Scrapy organization, having been pulled out of Scrapy back in 2015. Both run as plain libraries inside your own Python code, so you can self-host them on any environment you control. There is no vendor to contact and no separate commercial terms to accept.

Does HTTPx + Parsel render JavaScript?

No. HTTPx is an HTTP client that fetches raw responses. It supports synchronous and asynchronous requests, plus HTTP/2 as an optional extra, but it does not run a browser engine. Parsel only parses the HTML, XML, or JSON you already have, so pages that build their content with client-side JavaScript will come back empty or only partly populated. For those, pair this stack with a browser tool like Playwright, or pick Crawl4AI, which targets JavaScript-heavy pages.

How does HTTPx + Parsel compare to Scrapy?

Scrapy is a full framework. It ships with crawling orchestration, request scheduling, rate limiting, and data pipelines. HTTPx plus Parsel gives you the request and extraction pieces only, so you assemble the rest yourself. Reach for this stack when you want full control over a small or async-heavy job and dislike Scrapy's conventions. Reach for Scrapy when you need large-scale crawling with retries and concurrency handled for you.

When should I choose HTTPx + Parsel over Beautiful Soup?

Beautiful Soup is a forgiving HTML parser usually paired with the requests library, and it favors readability over speed. HTTPx plus Parsel fits when you want native async requests, HTTP/2, and lxml-backed CSS and XPath selectors that match Scrapy's extraction syntax. It suits developers who are comfortable building their own crawling and rate-limiting layer instead of relying on a framework to provide one.
