Historical Data
Sorsa API provides full-archive access to public X (formerly Twitter) data back to March 2006. Historical retrieval uses the same endpoints, authentication, and pagination as recent data. There is no separate “full-archive” tier, no enterprise contract, and no time-window restriction on search. This page covers the two endpoints used for historical work, what the platform exposes versus what it doesn’t, and the patterns that hold up at scale.
Note: For a fuller walkthrough with a method comparison table, a CSV export pipeline, and additional code examples, see Historical Twitter Data: How to Search Old Tweets via API on the blog.
Endpoints
| Endpoint | Use case | Pagination | Page size |
|---|---|---|---|
| POST /v3/search-tweets | Keyword-based archive search with date and engagement filters | next_cursor | ~20 tweets |
| POST /v3/user-tweets | A specific account’s complete posting history | next_cursor | ~20 tweets |
Queries accept the since:, until:, from:, to:, min_faves:, min_retweets:, lang:, and filter: directives.
Keyword Archive Search
Use /search-tweets when you need every tweet matching a query within a date window, across all users. Pass the date-bounded query in the JSON body.
order accepts "latest" (chronological, default for time-bounded queries) or "popular" (engagement-ranked, better for content research).
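As a rough illustration, a request body for a date-bounded search might be assembled as below. This is a sketch only: the field names `query` and `next_cursor` (as a request field) are assumptions not confirmed on this page, the helper `build_search_body` is hypothetical, and the `YYYY-MM-DD` date format follows X's search operator syntax. Only the `order` values come from the text above.

```python
import json
from typing import Optional


def build_search_body(query: str, order: str = "latest",
                      next_cursor: Optional[str] = None) -> dict:
    """Build a JSON-serializable body for POST /v3/search-tweets.

    Field names "query" and "next_cursor" are assumed for illustration.
    """
    if order not in ("latest", "popular"):
        raise ValueError(f"order must be 'latest' or 'popular', got {order!r}")
    body = {"query": query, "order": order}
    if next_cursor:
        # Only include the cursor when continuing a previous page.
        body["next_cursor"] = next_cursor
    return body


# Hypothetical query: a keyword bounded to a two-week window.
body = build_search_body("climate summit since:2015-11-30 until:2015-12-13")
print(json.dumps(body, indent=2))
```

Check the Search Tweets reference for the actual field names before sending this body.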
Full Account Timeline
Use /user-tweets when you want the complete posting history of one account, without a 3,200-tweet cap.
Follow next_cursor until it returns null. The endpoint walks the timeline in reverse chronological order.
What You Can Retrieve
Every historical tweet returns with the same field set as a recent one:
- Full text (no truncation, no URL replacement)
- All six engagement metrics: likes_count, retweet_count, reply_count, quote_count, view_count, bookmark_count
- Embedded user object with the author’s full profile
- entities array with media URLs (photos, videos, GIFs) and link previews
- Conversation metadata: conversation_id_str, in_reply_to_tweet_id, is_reply, is_quote_status
- Language tag (lang)
Platform-Level Limits
These are X-side restrictions, not Sorsa-specific. No public API can work around them.
- Deleted tweets are removed from X’s search index and cannot be retrieved.
- Protected accounts are excluded from all public search and timeline results.
- Profile snapshots are not historical. A tweet from 2014 returns the author’s current bio, username, and follower count, not the 2014 values.
- Engagement metrics are not snapshots. Like, retweet, and view counts reflect current totals, not the counts as they stood on a specific past date. If you need point-in-time engagement, ingest tweets in real time via Real-Time Monitoring and store the metrics yourself.
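Storing point-in-time engagement yourself could look like the sketch below, which appends one row per tweet per capture time to a local SQLite table. The table name `metric_snapshots` and the helpers `open_store` and `record_snapshot` are hypothetical. The metric names are the six fields listed on this page. The tweet ID is passed in separately because this page does not name the ID field.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional

METRICS = ("likes_count", "retweet_count", "reply_count",
           "quote_count", "view_count", "bookmark_count")


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the snapshot store with one integer column per metric."""
    conn = sqlite3.connect(path)
    cols = ", ".join(f"{m} INTEGER" for m in METRICS)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS metric_snapshots ("
        "tweet_id TEXT NOT NULL, captured_at TEXT NOT NULL, "
        f"{cols}, PRIMARY KEY (tweet_id, captured_at))"
    )
    return conn


def record_snapshot(conn: sqlite3.Connection, tweet_id: str, tweet: dict,
                    captured_at: Optional[str] = None) -> None:
    """Append the tweet's current metrics; missing metrics are stored as NULL."""
    captured_at = captured_at or datetime.now(timezone.utc).isoformat()
    values = [tweet.get(m) for m in METRICS]
    placeholders = ", ".join("?" for _ in range(len(METRICS) + 2))
    conn.execute(f"INSERT INTO metric_snapshots VALUES ({placeholders})",
                 [tweet_id, captured_at, *values])
    conn.commit()


# Two captures of the same (made-up) tweet, one day apart.
conn = open_store()
record_snapshot(conn, "1", {"likes_count": 10}, "2024-01-01T00:00:00+00:00")
record_snapshot(conn, "1", {"likes_count": 25}, "2024-01-02T00:00:00+00:00")
```

Because rows are keyed by tweet and capture time, earlier counts are never overwritten, so a count as of any stored capture time can be read back later.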
Best Practices
Chunk Large Date Ranges
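Date-range chunking can be sketched as a generator of calendar-month windows, each of which becomes its own since: / until: query. The helper `month_windows` is hypothetical. The sketch also assumes half-open windows, with until: treated as exclusive, which is how X's search operators behave.

```python
from datetime import date
from typing import Iterator, Tuple


def month_windows(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield (since, until) pairs covering [start, end) one calendar month at a time."""
    cursor = start
    while cursor < end:
        # First day of the month after `cursor`.
        if cursor.month == 12:
            next_month = date(cursor.year + 1, 1, 1)
        else:
            next_month = date(cursor.year, cursor.month + 1, 1)
        # Clamp the final window so it never runs past `end`.
        yield cursor, min(next_month, end)
        cursor = next_month


windows = list(month_windows(date(2015, 11, 15), date(2016, 2, 1)))
for since, until in windows:
    print(f"since:{since.isoformat()} until:{until.isoformat()}")
```

Each window can be collected, retried, and audited on its own. A weekly variant would step by `timedelta(days=7)` instead of by month boundary.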
A single query across a multi-year window has no clean retry path and no per-period auditability. Split by month for year-scale collections, by week for volatile event windows.
Filter Retweet Noise
Historical popular searches return waves of native retweets that bury original content. Add -filter:nativeretweets for sentiment, opinion, or content-pattern research. Use -filter:retweets to also exclude legacy RT @user: retweets.
Pair Engagement and Date Filters
Combining since: / until: with min_faves: or min_retweets: cuts noise and request volume sharply.
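One way to compose such a query string is sketched below. The helper `build_query` and the example values are hypothetical, not taken from the Sorsa reference. The operators themselves are the ones listed on this page, and dates are written as `YYYY-MM-DD`.

```python
from datetime import date
from typing import Optional


def build_query(terms: str, since: date, until: date,
                min_faves: Optional[int] = None,
                min_retweets: Optional[int] = None) -> str:
    """Join keywords with date bounds and optional engagement thresholds."""
    parts = [terms, f"since:{since.isoformat()}", f"until:{until.isoformat()}"]
    if min_faves is not None:
        parts.append(f"min_faves:{min_faves}")
    if min_retweets is not None:
        parts.append(f"min_retweets:{min_retweets}")
    return " ".join(parts)


# Illustrative values only: one month of tweets with at least 500 likes.
example_query = build_query("world cup", date(2014, 6, 12), date(2014, 7, 14),
                            min_faves=500)
print(example_query)
```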
Split Global Topics by Language
For worldwide events, separate queries per lang: give cleaner per-locale datasets than mixing languages.
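A per-language split can be as simple as deriving one query per language code from a shared base query. The helper `per_language_queries` and the sample query are hypothetical.

```python
from typing import Dict, Iterable


def per_language_queries(base_query: str, langs: Iterable[str]) -> Dict[str, str]:
    """Return one query per language code, each with a lang: directive appended."""
    return {lang: f"{base_query} lang:{lang}" for lang in langs}


queries = per_language_queries(
    "earthquake since:2011-03-11 until:2011-03-18", ["en", "ja", "es"]
)
for lang, query in queries.items():
    print(lang, "->", query)
```

Running each query separately keeps every result set tagged with its locale from the start, so no language detection is needed afterwards.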
Paginate Until the Cursor Is Empty
Terminate only when next_cursor is null, empty, or absent. Don’t stop early on small page sizes. Full pattern in Pagination.
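The termination rule can be sketched as a generator that takes a page-fetching callable. Here `fetch_page` is a hypothetical function that performs the actual request and returns the decoded JSON response, and the `tweets` response key is an assumption. Only `next_cursor` is named on this page.

```python
from typing import Callable, Iterator, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield tweets page by page until next_cursor is null, empty, or absent."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        # A short (or even empty) page is not a stop signal; only the cursor is.
        yield from page.get("tweets", [])
        cursor = page.get("next_cursor")
        if not cursor:  # covers None, "", and a missing key
            break


# Stand-in for the real API: three pages, the middle one short,
# and the last one with no next_cursor key at all.
_fake_pages = {
    None: {"tweets": [{"id": 1}, {"id": 2}], "next_cursor": "a"},
    "a": {"tweets": [{"id": 3}], "next_cursor": "b"},
    "b": {"tweets": [{"id": 4}, {"id": 5}]},
}


def fake_fetch(cursor: Optional[str]) -> dict:
    return _fake_pages[cursor]


ids = [tweet["id"] for tweet in paginate(fake_fetch)]
```

Passing the fetcher in as a callable keeps the loop identical for /search-tweets and /user-tweets and makes it testable without network access.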
Related
- Search Tweets: endpoint reference for /search-tweets
- Search Operators: full operator dictionary
- Pagination: cursor-based pagination details
- Real-Time Monitoring: pair with historical backfill for forward-looking ingestion
- Track Mentions: historical mention tracking for any handle
- Optimizing API Usage: reduce request count on large collections