
How to Access Historical Twitter Data: Search Old Tweets via API

Accessing historical Twitter data - tweets, replies, quotes, and engagement metrics going back years - is essential for market research, academic studies, OSINT investigations, competitive analysis, and trend tracking. Whether you need to analyze public discourse around a 2020 event, audit a brand’s entire posting history, or study how sentiment shifted over time on a topic, you need reliable access to Twitter’s full archive of public posts. This guide explains how to retrieve historical tweets programmatically using Sorsa API, which provides full-archive search access to every public post on X (formerly Twitter) dating back to 2006. No OAuth, no developer portal approval, no academic track application - just an API key and a POST request. You will learn how to search old tweets by keyword with date ranges, scrape a user’s complete timeline, filter historical content by engagement, and export the results for analysis.

Current Landscape: How to Get Old Tweets in 2026

Before diving into implementation, it helps to understand the available options and their tradeoffs:
| Method | Cost | Archive Depth | Setup | Data Completeness | Best For |
|---|---|---|---|---|---|
| X Advanced Search (web) | Free | Back to 2006 | None | Low (no export, manual scrolling) | Quick spot checks |
| X API Full-Archive Search | $5,000+/mo | Back to 2006 | High (OAuth, approvals) | High (with field config) | Funded research teams |
| Snscrape / scrapers | Free | Varies | Medium (coding, proxies) | Unreliable (anti-bot) | Budget experiments |
| Academic archives (TweetSets) | Free | Event-specific | Medium (hydration needed) | Partial (dehydrated IDs) | Specific event datasets |
| Sorsa API | Pay-per-use | Back to 2006 | Low (API key only) | High (full data by default) | Production pipelines, research |
The official X API’s Full-Archive Search (/2/tweets/search/all) is powerful but requires the Pro tier ($5,000+/month) or Enterprise access. The Academic Research track that once provided free full-archive access has been discontinued for new applicants. Even with access, the v2 API returns minimal data by default (just id and text) - you must explicitly request tweet.fields, user.fields, and expansions to get engagement metrics and author profiles. Sorsa API eliminates these barriers: full-archive access with a single API key, flat JSON responses that include all tweet fields and author data by default, and simple POST-based endpoints that work the same way whether you are searching tweets from yesterday or from 2012.
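To make the field-configuration burden concrete, here is a sketch of what a v2 full-archive request looks like once you ask for engagement metrics and author data. The endpoint and parameter names (tweet.fields, user.fields, expansions) come from X's v2 API; the specific field values and the bearer token are illustrative placeholders, and the actual call requires Pro-tier access.

```python
# Sketch: the extra field configuration the official v2 full-archive
# endpoint needs before it returns more than id and text.
V2_URL = "https://api.twitter.com/2/tweets/search/all"

params = {
    "query": "climate change lang:en",
    "start_time": "2015-06-01T00:00:00Z",
    "end_time": "2015-12-31T00:00:00Z",
    "tweet.fields": "created_at,public_metrics,lang,conversation_id",
    "user.fields": "username,name,public_metrics,verified",
    "expansions": "author_id,attachments.media_keys",
    "max_results": 100,
}
headers = {"Authorization": "Bearer YOUR_BEARER_TOKEN"}

# requests.get(V2_URL, params=params, headers=headers)  # Pro tier only
print(sorted(params))
```

None of this configuration is needed with Sorsa: the flat response already includes the tweet fields and embedded author profile.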

Two Endpoints for Historical Retrieval

Sorsa offers two paths depending on whether you are searching by keyword or scraping a specific account’s timeline.

Keyword-Based Archive Search: /search-tweets

The /v3/search-tweets endpoint supports all X search operators, including since: and until: date filters. Pass your date-bounded query in the JSON body:
POST https://api.sorsa.io/v3/search-tweets
{
  "query": "\"climate change\" since:2015-06-01 until:2015-12-31 lang:en min_faves:10",
  "order": "latest"
}
This returns tweets matching “climate change” posted between June and December 2015, in English, with at least 10 likes, sorted newest first.

Account Timeline Scraping: /user-tweets

The /v3/user-tweets endpoint returns a user’s complete posting history in reverse chronological order (newest first). Paginate until next_cursor returns null to capture the full archive.
POST https://api.sorsa.io/v3/user-tweets
{
  "link": "https://x.com/elonmusk"
}

What You Can Retrieve

Every historical tweet comes with the same rich data as a recent one:
  • Full text, with no truncation.
  • All engagement metrics at their current values: likes, retweets, replies, quotes, views, bookmarks.
  • The complete author profile embedded in each tweet.
  • Media entities with links to images, videos, and GIFs.
  • Conversation metadata: conversation_id_str, in_reply_to_tweet_id, is_reply, is_quote_status.
  • Language tags.
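Because metrics and author data arrive together in one flat object, derived metrics are a one-liner away. The sketch below computes interactions per 1,000 views; the field names follow the response shape used throughout this guide's examples, and the sample dict is illustrative, not real data.

```python
# Sketch: a simple engagement summary from one flat tweet object.
def engagement_rate(tweet):
    """Interactions per 1,000 views (0.0 if views are unavailable)."""
    interactions = (
        tweet.get("likes_count", 0)
        + tweet.get("retweet_count", 0)
        + tweet.get("reply_count", 0)
        + tweet.get("quote_count", 0)
    )
    views = tweet.get("view_count", 0)
    return round(1000 * interactions / views, 2) if views else 0.0

sample = {
    "likes_count": 120, "retweet_count": 30, "reply_count": 10,
    "quote_count": 5, "view_count": 25000,
}
print(engagement_rate(sample))  # → 6.6
```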

What Is Not Accessible

These are platform-level limitations, not Sorsa-specific:
  • Deleted tweets. Removed from X’s search index entirely. No API can retrieve them.
  • Protected accounts. Tweets from accounts with “Protect your posts” enabled are excluded from all search and timeline results.
  • Historical profile snapshots. Profile data reflects the current state (current bio, username, follower count), not past versions.
  • Historical engagement snapshots. Metrics show current totals, not what they were at a specific point in time. A tweet from 2018 shows its 2026 like count.

Code Example: Time-Range Keyword Search (Python)

Search for tweets about a specific topic within a defined historical window. This is the most common pattern for event analysis, campaign retrospectives, and academic research.
import requests
import time

API_KEY = "YOUR_API_KEY"
URL = "https://api.sorsa.io/v3/search-tweets"

def search_historical(query, max_pages=10):
    """Search the full tweet archive with automatic pagination."""
    all_tweets = []
    next_cursor = None

    for page in range(max_pages):
        body = {"query": query, "order": "latest"}
        if next_cursor:
            body["next_cursor"] = next_cursor

        resp = requests.post(
            URL,
            headers={"ApiKey": API_KEY, "Content-Type": "application/json"},
            json=body,
        )
        resp.raise_for_status()
        data = resp.json()

        tweets = data.get("tweets", [])
        all_tweets.extend(tweets)
        print(f"Page {page + 1}: {len(tweets)} tweets (total: {len(all_tweets)})")

        next_cursor = data.get("next_cursor")
        if not next_cursor:
            break
        time.sleep(0.1)

    return all_tweets


# Search for SpaceX tweets from June 2015
tweets = search_historical('SpaceX since:2015-06-01 until:2015-07-01 lang:en')

for t in tweets[:5]:
    print(f"[{t['created_at']}] @{t['user']['username']}")
    print(f"  {t['full_text'][:120]}...")
    print(f"  Likes: {t['likes_count']} | RTs: {t['retweet_count']}\n")

Code Example: Full Account Timeline Scraping (Python)

Walk through an account’s entire posting history from newest to oldest. Useful for competitor audits, influencer analysis, or archiving a public figure’s complete output.
import requests
import time

API_KEY = "YOUR_API_KEY"
URL = "https://api.sorsa.io/v3/user-tweets"

def scrape_full_timeline(username, max_pages=100):
    """Scrape a user's complete tweet history via /user-tweets."""
    all_tweets = []
    next_cursor = None

    for page in range(max_pages):
        body = {"link": f"https://x.com/{username}"}
        if next_cursor:
            body["next_cursor"] = next_cursor

        resp = requests.post(
            URL,
            headers={"ApiKey": API_KEY, "Content-Type": "application/json"},
            json=body,
        )
        resp.raise_for_status()
        data = resp.json()

        tweets = data.get("tweets", [])
        all_tweets.extend(tweets)

        if tweets:
            oldest = tweets[-1]["created_at"]
            print(f"Page {page + 1}: {len(tweets)} tweets (total: {len(all_tweets)}) | oldest: {oldest}")

        next_cursor = data.get("next_cursor")
        if not next_cursor:
            print("Reached end of timeline.")
            break
        time.sleep(0.1)

    return all_tweets


tweets = scrape_full_timeline("naval", max_pages=50)
print(f"\nCollected {len(tweets)} tweets from @naval's full history.")

Code Example: Historical Search with Engagement Filter (JavaScript)

Find high-engagement historical content on any topic. Ideal for content research - discovering what resonated with audiences in past years.
const API_KEY = "YOUR_API_KEY";
const URL = "https://api.sorsa.io/v3/search-tweets";

async function searchHistorical(query, maxPages = 10) {
  const allTweets = [];
  let nextCursor = null;

  for (let page = 0; page < maxPages; page++) {
    const body = { query, order: "popular" };
    if (nextCursor) body.next_cursor = nextCursor;

    const resp = await fetch(URL, {
      method: "POST",
      headers: { "ApiKey": API_KEY, "Content-Type": "application/json" },
      body: JSON.stringify(body),
    });

    if (!resp.ok) throw new Error(`API error: ${resp.status}`);
    const data = await resp.json();

    const tweets = data.tweets || [];
    allTweets.push(...tweets);
    console.log(`Page ${page + 1}: ${tweets.length} tweets (total: ${allTweets.length})`);

    nextCursor = data.next_cursor;
    if (!nextCursor) break;
    await new Promise((r) => setTimeout(r, 100));
  }

  return allTweets;
}

// Find viral Tesla tweets from 2019
(async () => {
  const tweets = await searchHistorical(
    'Tesla since:2019-01-01 until:2019-12-31 min_retweets:1000 -filter:nativeretweets lang:en'
  );

  for (const t of tweets.slice(0, 5)) {
    console.log(`@${t.user.username}: ${t.full_text.slice(0, 100)}...`);
    console.log(`  Likes: ${t.likes_count} | Views: ${t.view_count}\n`);
  }
})();

Exporting Historical Data to CSV

A complete pipeline for collecting historical tweets and writing them to a CSV file ready for Excel, Google Sheets, Pandas, or any BI tool.
import requests
import time
import csv

API_KEY = "YOUR_API_KEY"
URL = "https://api.sorsa.io/v3/search-tweets"

def export_historical_to_csv(query, output_file="historical_tweets.csv", max_pages=50):
    """Search historical tweets and export to CSV."""
    fields = [
        "tweet_id", "created_at", "full_text", "lang",
        "likes", "retweets", "replies", "quotes", "views",
        "author_id", "username", "display_name", "followers", "verified",
    ]

    with open(output_file, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        next_cursor, total = None, 0

        for page in range(max_pages):
            body = {"query": query, "order": "latest"}
            if next_cursor:
                body["next_cursor"] = next_cursor

            resp = requests.post(URL, headers={"ApiKey": API_KEY, "Content-Type": "application/json"}, json=body)
            resp.raise_for_status()
            data = resp.json()

            for t in data.get("tweets", []):
                u = t.get("user", {})
                writer.writerow({
                    "tweet_id": t["id"], "created_at": t["created_at"],
                    "full_text": t["full_text"], "lang": t.get("lang", ""),
                    "likes": t.get("likes_count", 0), "retweets": t.get("retweet_count", 0),
                    "replies": t.get("reply_count", 0), "quotes": t.get("quote_count", 0),
                    "views": t.get("view_count", 0), "author_id": u.get("id", ""),
                    "username": u.get("username", ""), "display_name": u.get("display_name", ""),
                    "followers": u.get("followers_count", 0), "verified": u.get("verified", False),
                })
                total += 1

            next_cursor = data.get("next_cursor")
            print(f"Page {page + 1} done. Total: {total} tweets.")
            if not next_cursor:
                break
            time.sleep(0.1)

    print(f"\nExport complete: {total} tweets saved to {output_file}")


# Bitcoin tweets from 2020, English, 50+ likes, no retweets
export_historical_to_csv(
    query='bitcoin since:2020-01-01 until:2020-12-31 lang:en min_faves:50 -filter:retweets',
    output_file="bitcoin_2020.csv",
    max_pages=100,
)

Strategies for Large-Scale Historical Collection

Chunk Large Date Ranges into Monthly Windows

Collecting data across a full year in a single query gives you no control over volume per batch and no clean resume point if something fails. Break the range into monthly chunks instead:
import calendar

def generate_monthly_chunks(year):
    """Generate (since, until) date pairs for each month."""
    chunks = []
    for month in range(1, 13):
        since = f"{year}-{month:02d}-01"
        until_month = month + 1 if month < 12 else 1
        until_year = year if month < 12 else year + 1
        until = f"{until_year}-{until_month:02d}-01"
        chunks.append((since, until))
    return chunks


for since, until in generate_monthly_chunks(2020):
    query = f'"climate change" since:{since} until:{until} lang:en min_faves:10'
    export_historical_to_csv(query, output_file=f"climate_{since[:7]}.csv", max_pages=50)

Filter Retweet Noise with -filter:nativeretweets

Historical searches often return a flood of retweets that bury original content. Adding -filter:nativeretweets ensures you only see original posts - the actual opinions, analysis, and commentary from real users. This is critical for sentiment analysis and content research.
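If you build queries programmatically, it is easy to forget the filter on some of them. A small helper like the hypothetical one below (pure string handling, no API assumptions) appends -filter:nativeretweets unless a retweet filter is already present:

```python
# Sketch: ensure a query excludes retweets before sending it.
def exclude_retweets(query):
    if "-filter:nativeretweets" in query or "-filter:retweets" in query:
        return query
    return f"{query} -filter:nativeretweets"

print(exclude_retweets("bitcoin since:2020-01-01 until:2020-12-31"))
```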

Combine Engagement Filters with Date Ranges

Pair time filters with engagement thresholds to surface only the tweets that gained real traction during the window you care about:
"product launch" since:2022-03-01 until:2022-03-31 min_faves:100 -filter:retweets lang:en

Use Language Filters for Global Event Research

When researching events with worldwide coverage, always specify lang:xx in your query. Run separate queries per language for cleaner analysis:
"world cup" since:2022-11-20 until:2022-12-19 lang:en min_faves:50
"world cup" since:2022-11-20 until:2022-12-19 lang:es min_faves:50
"world cup" since:2022-11-20 until:2022-12-19 lang:pt min_faves:50

Paginate Until the Cursor Is Empty

The cursor-based pagination loop should stop only when next_cursor is null, empty, or absent from the response. Do not stop based on the number of tweets in a single page - some pages may return fewer results without indicating the end. Always check the cursor explicitly. For the full pagination pattern, see Pagination.
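The stop condition can be isolated from the HTTP layer so it is easy to test. In this sketch, iter_all_tweets is a hypothetical generator that accepts any fetch_page callable returning a dict with "tweets" and "next_cursor" (the shape used in this guide); the stub fetcher shows that a short page does not end the loop — only an empty cursor does:

```python
# Sketch: fetch-agnostic pagination that stops only on an empty cursor.
def iter_all_tweets(fetch_page, max_pages=1000):
    cursor = None
    for _ in range(max_pages):
        data = fetch_page(cursor)
        yield from data.get("tweets", [])
        cursor = data.get("next_cursor")
        if not cursor:  # null, empty string, or absent all mean "done"
            break

# Stub fetcher: page 2 is short, but the cursor says there is more.
pages = {
    None: {"tweets": [1, 2], "next_cursor": "c1"},
    "c1": {"tweets": [3], "next_cursor": "c2"},   # short page, keep going
    "c2": {"tweets": [4, 5], "next_cursor": None},
}
print(list(iter_all_tweets(pages.get)))  # → [1, 2, 3, 4, 5]
```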

Next Steps

  • How to Search Tweets via API - complete guide to /search-tweets including all parameters and real-time use cases.
  • Search Operators Reference - full dictionary of date, engagement, media, and Boolean operators you can use in historical queries.
  • Pagination - deep dive into cursor-based pagination for large-scale collection.
  • Real-Time Monitoring - combine historical backfill with live monitoring for complete coverage.
  • Search Mentions Guide - track how often an account has been mentioned over any historical period.
  • API Reference - full specification for /user-tweets, /search-tweets, and all 38 endpoints.