serp.fast

How to Scrape X (Twitter) – Tools and Approach (2026)

Independent guide to extracting public X (Twitter) data after the API pricing changes. Tool options, the 2023–2026 access landscape, and what AI builders should actually do.

Nathan Kessler · Reviewed
5 min read · Difficulty: hard

Some links on this page are affiliate links. We earn a commission if you sign up – at no additional cost to you. Our editorial assessment is independent and never paid.

Legal note

X (formerly Twitter) prohibits scraping in its Terms of Service and has pursued legal action against scrapers since the 2023 platform changes. The official API is technically the cleanest path, but its pricing tiers (Basic at $200/month, Pro at $5,000/month, Enterprise from $42,000/month) put production access out of reach for most independent builders. Scraping public timeline data sits in a contested but commonly practiced gray zone – most production scrapers route through commercial platforms and accept that X invests heavily in detection.

X – the platform formerly known as Twitter – has been the most volatile scraping target since the late-2022 ownership change and the 2023 API changes that followed. API pricing increased dramatically, technical defenses intensified, login walls came and went, and what worked last quarter often does not work this quarter. For AI builders who need real-time public discussion data – sentiment monitoring, news detection, brand tracking, social grounding for AI agents – X remains essential despite the friction. This guide covers what is currently workable, the tools that handle it, and the cost structure for production use.

Why scrape X

The use cases divide along volume and freshness lines. Real-time monitoring – brand mentions, news breaks, market sentiment – needs continuous low-latency access to recent posts on specific keywords or accounts. Bulk analysis and training – building datasets for AI training, studying discourse patterns, retrospective campaign analysis – wants high-volume historical access. Profile and account research – identifying influencers, mapping community structure, verifying account authenticity – needs targeted profile-level data. AI grounding – pulling current discussion into a model's context for queries about news, products, or public figures – needs query-time access at unpredictable cadence.

Each use case has shifted differently since the API pricing changes. Real-time monitoring and AI grounding got dramatically more expensive on the official path. Bulk training-data collection became almost impossible through the API and pushed toward scraping or licensed providers. Profile research stayed roughly the same in operational shape but the per-record cost rose.

The 2023–2026 access landscape

The API tiers as of 2026 are: Free (100 reads per month, 1,500 posts per month, useful only for testing), Basic at $200 per month (10,000 tweet reads, 50,000 tweet posts, useful for very small monitoring use cases), Pro at $5,000 per month (1 million tweet reads, 300,000 tweet posts, useful for serious monitoring and small AI products), and Enterprise starting around $42,000 per month with custom volume tiers.
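
To put the paid tiers on a common footing, the sketch below converts each tier's monthly price and read quota (the figures quoted above) into an effective cost per 1,000 reads. It assumes the full quota is consumed every month; the dictionary layout and function name are illustrative only.

```python
# Paid tiers as quoted in this guide: (monthly price in USD, tweet reads per month).
TIERS = {
    "Basic": (200, 10_000),
    "Pro": (5_000, 1_000_000),
}


def cost_per_1k_reads(monthly_price: float, monthly_reads: int) -> float:
    """Effective USD cost per 1,000 reads, assuming the whole quota is used."""
    return monthly_price * 1_000 / monthly_reads


for name, (price, reads) in TIERS.items():
    print(f"{name}: ${cost_per_1k_reads(price, reads):.2f} per 1,000 reads")
```

At full utilization Basic works out to $20 and Pro to $5 per 1,000 reads. Any unused quota raises the effective rate, so these are best-case numbers.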

For comparison, the pre-2023 standard developer tier was free for non-commercial use and supported significantly higher read volumes. The API pricing change pushed essentially all small-and-medium AI builders off the official path. The market response was a cluster of scraping-based and licensed-data-based alternatives, plus a smaller cohort of builders who paid for the Pro tier and stayed on the API.

The technical anti-scraping changes paralleled the pricing changes. Rate limits on logged-out browsing tightened. Login walls appeared on search, lists, and some profile features. Account-required browsing became the default for several months in mid-2023, was partially rolled back in late 2023, and has shifted intermittently since. Headless browser fingerprinting and behavioral analysis were added or strengthened. The maintenance burden for X scrapers became continuous – tools that worked in Q1 may need updates in Q3.

Technical challenges

X's anti-bot stack is sophisticated and actively maintained. Rate limiting is aggressive on logged-out IP ranges. Login walls trigger unpredictably on previously public surfaces. Browser fingerprinting catches headless automation tells. Behavioral analysis flags sustained sequential requests. CAPTCHA frequency is high.
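
One common response to aggressive rate limiting is to slow down when the target signals pressure instead of retrying immediately. The following is a minimal sketch of exponential backoff with full jitter. `fetch` is a hypothetical callable standing in for whatever client you use, and treating 429 and 503 as the retryable statuses is an assumption, not documented X behavior.

```python
import random
import time
from typing import Callable, Optional, Tuple


def backoff_delay(attempt: int, base: float = 2.0, cap: float = 300.0) -> float:
    """Full-jitter delay: a random wait in [0, min(cap, base * 2**attempt)]."""
    return random.uniform(0.0, min(cap, base * (2 ** attempt)))


def fetch_with_backoff(
    fetch: Callable[[str], Tuple[int, str]],  # hypothetical: returns (status, body)
    url: str,
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Retry on 429/503 (assumed throttling signals); give up after max_attempts."""
    for attempt in range(max_attempts):
        status, body = fetch(url)
        if status == 200:
            return body
        if status not in (429, 503):
            return None  # non-retryable response: do not keep hitting the endpoint
        sleep(backoff_delay(attempt))
    return None
```

The `sleep` parameter is injectable so the retry logic can be exercised without real waiting. The base delay and cap are placeholders to tune against observed block rates.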

Beyond access, X's content delivery is heavily JavaScript-driven. Tweets are rendered client-side, with progressive loading and infinite scroll, so plain HTTP scraping rarely returns useful content. Workable extraction requires real browser rendering and often authenticated sessions, which means managing a pool of logged-in accounts that survive X's automated detection of bot accounts.

Layout and DOM volatility is constant. X's product team ships changes frequently and uses extensive A/B testing, exposing different layouts to different visitors simultaneously. Selector-based scrapers break often. LLM-based extraction is more robust but adds per-tweet cost.
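
One way to balance the two approaches is to try cheap selector-style extractors first and pay for the LLM only when they all miss. The sketch below shows that tiering in the abstract. Every extractor here is a hypothetical callable you would supply yourself; nothing in it reflects X's actual markup or any specific LLM API.

```python
from typing import Callable, Dict, Iterable, Optional

# An extractor takes raw page content and returns the tweet text, or None on a miss.
Extractor = Callable[[str], Optional[str]]


def extract_text(
    raw: str,
    cheap_extractors: Iterable[Extractor],
    expensive_fallback: Extractor,  # hypothetical LLM-backed extractor
    stats: Dict[str, int],
) -> Optional[str]:
    """Try cheap extractors in order; call the costly fallback only if all miss."""
    for extractor in cheap_extractors:
        result = extractor(raw)
        if result:
            stats["cheap_hits"] = stats.get("cheap_hits", 0) + 1
            return result
    stats["fallback_calls"] = stats.get("fallback_calls", 0) + 1
    return expensive_fallback(raw)
```

Tracking `fallback_calls` gives an early signal that a layout change has broken the selectors, and it bounds the per-tweet LLM spend to the cases that need it.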

Tool recommendations

Four options handle X scraping competently in the post-2023 environment: three scraping platforms and one category of managed browser infrastructure.

Apify runs the deepest catalog of X-specific actors. Pre-built actors cover profile scraping, tweet search, thread expansion, follower lists, and trend monitoring. Each is maintained as X changes layouts and defenses. Pricing per actor run plus compute units typically lands in the $3–10 per 1,000 tweets range. Choose Apify when you want the fastest working path and your volume is in the thousands-to-low-millions of tweets per month.

Scrapfly is the strongest general-purpose option for X. It combines aggressive stealth tuning, residential proxies, and built-in handling for X's anti-bot layer. Pricing is per request, with multipliers for JS rendering. Choose Scrapfly when X is one target in a broader pipeline and you have the engineering capacity to write the parsing layer.

ZenRows offers a similar profile to Scrapfly – JS rendering, anti-bot bypass, stealth-tuned residential proxies. Either is a reasonable choice; the differentiator at small scale is mostly developer experience and pricing tier alignment.

Browserbase, Steel.dev, and Hyperbrowser become relevant when you need authenticated sessions – for example, a logged-in account viewing protected timelines or executing actions like follows and DMs. These tools provide managed browser infrastructure with persistent session state. Per-session-minute pricing is higher than dedicated scraping APIs, but the capability is unique. Choose browser infrastructure when scraping APIs cannot reach the data, or when you are running agent-style workflows that need to interact with X rather than just observe it.

For AI builders with very high volume or strict freshness requirements, evaluate licensed data resellers. Companies that paid for enterprise X data access during the pre-2023 era continue to resell it under various commercial terms. The economics often favor licensing over scraping at sustained scale of millions of tweets per day.

For brand and topic monitoring at moderate volume, Apify's tweet search actor is the path of least resistance. Schedule queries every 15–60 minutes depending on freshness needs. Cache deduplicated tweet IDs to avoid double-processing. Plan for 10–20% block rates during peak X traffic events.
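
Deduplication matters because overlapping scheduled queries return many of the same tweets. The sketch below filters each batch against a set of already-seen IDs. The `"id"` field name is an assumption about the shape of the records your scraper returns, and the in-memory set stands in for whatever persistent store you would use in production.

```python
from typing import Iterable, Iterator, Set


def new_tweets(batch: Iterable[dict], seen_ids: Set[str]) -> Iterator[dict]:
    """Yield only tweets whose ID has not been seen; record IDs as they pass."""
    for tweet in batch:
        tweet_id = str(tweet["id"])  # assumed field name
        if tweet_id in seen_ids:
            continue
        seen_ids.add(tweet_id)
        yield tweet


seen: Set[str] = set()
first_run = list(new_tweets([{"id": 1}, {"id": 2}], seen))
second_run = list(new_tweets([{"id": 2}, {"id": 3}], seen))
print(len(first_run), len(second_run))
```

Only the tweet with ID 3 survives the second run, so downstream processing sees each tweet once regardless of how much consecutive query windows overlap.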

For sentiment grounding inside an AI product, latency dominates. Use Scrapfly's fast-mode tweet endpoint and cache by tweet ID with a 30-minute TTL. Pre-warm caches for high-volume accounts and trending topics to keep query-time latency low.
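
The caching side of this can be sketched independently of any vendor. Below is a minimal in-memory cache keyed by tweet ID with the 30-minute TTL mentioned above. The class and method names are illustrative, and a production system would more likely use a shared store with native expiry.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Minimal tweet-ID cache whose entries expire after ttl_seconds."""

    def __init__(
        self,
        ttl_seconds: float = 30 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def put(self, tweet_id: str, value: Any) -> None:
        """Store a value together with the time it was cached."""
        self._store[tweet_id] = (self._clock(), value)

    def get(self, tweet_id: str) -> Optional[Any]:
        """Return the cached value, or None if it is absent or expired."""
        entry = self._store.get(tweet_id)
        if entry is None:
            return None
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._store[tweet_id]  # evict lazily on read
            return None
        return value
```

Pre-warming then amounts to calling `put` for high-volume accounts and trending topics on a schedule, so that query-time `get` calls rarely miss.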

For training-data corpus building, Apify works at moderate scale, but cost compounds quickly past a few million tweets per month. Evaluate licensed data resellers at that scale.
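
To see how quickly cost compounds, the arithmetic below applies the $3–10 per 1,000 tweets range quoted earlier for Apify to a few monthly volumes. The volumes are arbitrary examples, and actual pricing depends on the actor and compute used.

```python
LOW_RATE, HIGH_RATE = 3.0, 10.0  # USD per 1,000 tweets, range quoted in this guide


def monthly_cost_range(tweets_per_month: int) -> tuple:
    """Return (low, high) estimated monthly spend in USD for a given volume."""
    thousands = tweets_per_month / 1_000
    return thousands * LOW_RATE, thousands * HIGH_RATE


for volume in (100_000, 1_000_000, 5_000_000):
    low, high = monthly_cost_range(volume)
    print(f"{volume:>9,} tweets/month: ${low:,.0f} to ${high:,.0f}")
```

At 5 million tweets per month the estimate is $15,000 to $50,000, which is the range where a licensing quote becomes worth requesting.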

For account and profile research, Apify's profile actors handle most needs cleanly. Volumes are usually low enough that per-record pricing is acceptable.

The cross-cutting reality is that X data access is materially harder, more expensive, and more legally contested than it was before 2023. AI builders who depend on X data should plan for ongoing operational maintenance, budget for higher per-record costs than for most other targets, and design their products to gracefully degrade when X access is interrupted by platform changes.
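
Graceful degradation can be as simple as serving the last good result, flagged as stale, when a live fetch fails. The sketch below shows one way to do that. `fetch_live` is a hypothetical function wrapping whichever tool you use, and the plain dictionary stands in for a persistent store.

```python
from typing import Callable, Dict, List, Tuple


def fetch_or_stale(
    query: str,
    fetch_live: Callable[[str], List[dict]],  # hypothetical live fetcher; may raise
    last_good: Dict[str, List[dict]],
) -> Tuple[List[dict], bool]:
    """Return (results, is_fresh). On failure, serve the last good results if any."""
    try:
        results = fetch_live(query)
    except Exception:
        # Access interrupted: fall back to whatever was fetched successfully before.
        return last_good.get(query, []), False
    last_good[query] = results
    return results, True
```

The `is_fresh` flag lets the product tell users, or the downstream model, that the data may be out of date instead of failing outright.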

Frequently asked

Can I just use the X API instead of scraping?
You can, and for some use cases the API is the right choice – sentiment monitoring, branded mention tracking, basic search. But pricing has shifted dramatically. The free tier is limited to 100 reads and 1,500 posts per month. Basic at $200/month gives 10,000 tweet reads. Pro at $5,000/month gives 1 million reads. Most AI builders find the API economics prohibitive for any non-trivial use case and end up scraping or working through a third-party data provider.
What changed with X scraping after Elon Musk's acquisition?
Three major shifts: API pricing increased by 100x or more on most tiers, technical anti-scraping measures intensified (rate limiting, login walls on previously public pages, account-required browsing), and legal action against scrapers became more aggressive. The combination pushed most production scraping to either commercial scraping APIs or licensed data providers like Bright Data.
Are X posts behind a login wall now?
Partially. X has rolled out login walls on browse-while-logged-out flows, then partially rolled them back, and the policy changes frequently. Public profile pages and individual tweet URLs are usually viewable without login but are aggressively rate-limited. Search and timeline features typically require login. The practical effect: scraping requires either logged-in sessions (which adds account management complexity) or strict throttling on the still-public surfaces.
Is there a cheap way to get X data?
Apify's X actors and direct scraping through Scrapfly are usually the most cost-effective paths for non-API access. Expect $2–10 per 1,000 tweets depending on whether you need profile, search, or thread data. For very high-volume use cases, evaluate licensed data providers (Bright Data, Gnip-style firehose resellers) – at sufficient scale, licensing is often cheaper than scraping plus the legal risk.
