Optimizing API Usage: Deduplication, Batch Endpoints, and Request Efficiency
When you are building a production pipeline on top of the Sorsa API - scraping followers, collecting tweets, enriching profiles, verifying campaign actions - the difference between a well-architected integration and a naive one can be 10x in request volume for the same output. This guide covers the patterns and techniques that minimize wasted API calls while maximizing the data you extract from each request.
Principle 1: Every Tweet Response Already Contains User Data
This is the single most important efficiency insight for working with the Sorsa API. Every endpoint that returns tweets - /search-tweets, /user-tweets, /list-tweets, /comments, /quotes, /mentions - embeds the complete author profile inside each tweet object.
The user object inside each tweet contains the same data that a dedicated /info call would return: ID, username, display name, bio, follower count, following count, tweet count, verified status, location, creation date, profile image, and more.
What this means in practice: If you search for tweets about a topic and want to build a list of users discussing it, you do not need to make separate /info calls for each author. The user data is already in the response. Extract it directly:
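(A minimal sketch; the response shape and field names below are illustrative assumptions based on the description above, not a confirmed schema:)

```python
# Illustrative /search-tweets response fragment; field names are assumptions.
response = {
    "tweets": [
        {"id": "101", "text": "interesting topic", "user": {"id": "u1", "username": "alice", "followers_count": 1200}},
        {"id": "102", "text": "following up", "user": {"id": "u1", "username": "alice", "followers_count": 1200}},
        {"id": "103", "text": "me too", "user": {"id": "u2", "username": "bob", "followers_count": 50}},
    ]
}

# Collect unique authors keyed by the permanent user ID - no /info calls needed.
authors = {}
for tweet in response["tweets"]:
    authors[tweet["user"]["id"]] = tweet["user"]

print(len(authors))  # 2 unique authors from 3 tweets
```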
This pattern alone can eliminate hundreds of /info requests in a typical workflow.
Principle 2: Use Batch Endpoints When They Exist
Sorsa provides batch variants for the most common lookup operations. Using them instead of looping through the single-item endpoint reduces your request count dramatically.
/info-batch Instead of Looping /info
If you need profile data for multiple accounts, do not call /info in a loop. Use /info-batch to fetch them all at once:
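(A sketch of the batch pattern. The maximum batch size of 100, the `usernames` payload key, and the `fetch` helper are all assumptions; substitute your real HTTP client and check the API reference for actual limits:)

```python
def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def lookup_profiles(handles, fetch, batch_size=100):
    """One /info-batch request per chunk instead of one /info call per handle."""
    profiles = {}
    for chunk in chunked(handles, batch_size):
        for user in fetch("/info-batch", {"usernames": chunk}):
            profiles[user["id"]] = user
    return profiles

# Stub client that records calls and fabricates minimal profiles:
calls = []
def fake_fetch(endpoint, payload):
    calls.append(endpoint)
    return [{"id": f"id-{name}", "username": name} for name in payload["usernames"]]

profiles = lookup_profiles([f"user{i}" for i in range(250)], fake_fetch)
print(len(calls), len(profiles))  # 3 requests instead of 250
```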
/tweet-info-bulk Instead of Looping /tweet-info
If you have a list of tweet IDs (from an archive, a mention export, or a list of bookmarked links) and need their current engagement metrics and author data, use the bulk endpoint to hydrate up to 100 tweets in a single request:
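(A sketch, assuming status links in the standard `/status/<id>` form; the `tweet_ids` payload key and the `fetch` helper are assumptions standing in for your real HTTP client:)

```python
import re

def tweet_id_from_url(url):
    """Pull the numeric tweet ID out of a status link, or None if absent."""
    match = re.search(r"/status/(\d+)", url)
    return match.group(1) if match else None

def hydrate_tweets(urls, fetch, batch_size=100):
    """Hydrate up to 100 tweets per /tweet-info-bulk request."""
    ids = [tid for tid in (tweet_id_from_url(u) for u in urls) if tid]
    tweets = []
    for i in range(0, len(ids), batch_size):
        tweets.extend(fetch("/tweet-info-bulk", {"tweet_ids": ids[i:i + batch_size]}))
    return tweets

print(tweet_id_from_url("https://x.com/alice/status/1790000000000000001"))
# 1790000000000000001
```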
Principle 3: Use /list-tweets Instead of Multiple /user-tweets Calls
If you are monitoring or collecting recent tweets from multiple accounts, do not poll each one individually. Add them to an X List and make a single /list-tweets request that returns the combined recent activity from all members.
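Downstream, you can split the combined timeline back out per member in memory. A sketch over an illustrative response shape (field names are assumptions):

```python
from collections import defaultdict

# Illustrative /list-tweets response: one request covers every List member.
list_response = [
    {"id": "201", "text": "post A", "user": {"id": "u1", "username": "alice"}},
    {"id": "202", "text": "post B", "user": {"id": "u2", "username": "bob"}},
    {"id": "203", "text": "post C", "user": {"id": "u1", "username": "alice"}},
]

# Group the combined timeline by author locally - no per-account polling.
per_author = defaultdict(list)
for tweet in list_response:
    per_author[tweet["user"]["username"]].append(tweet)

print(sorted(per_author))  # ['alice', 'bob']
```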
Principle 4: Use /info as Your Universal Resolver
The /info endpoint accepts a username, user ID, or profile link - any format. It returns the full profile object including the permanent User ID. This makes it the most flexible endpoint for normalizing mixed inputs.
If you receive account references in different formats and need to resolve all of them to full profiles, use /info once per account instead of calling a conversion endpoint (/username-to-id, /link-to-id) followed by a separate /info call:
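(A sketch of the normalization loop; the `user` parameter name and the `fetch` helper are assumptions standing in for your real HTTP client:)

```python
def resolve_accounts(references, fetch):
    """Resolve any mix of handles, IDs, and links to full profiles via /info,
    keyed by the permanent user ID so duplicates collapse automatically."""
    resolved = {}
    for ref in references:
        profile = fetch("/info", {"user": ref})  # /info accepts any format
        resolved[profile["id"]] = profile
    return resolved

# Stub client: pretend every reference points at the same account.
def fake_fetch(endpoint, params):
    return {"id": "u42", "username": "alice", "followers_count": 1200}

mixed = ["alice", "u42", "https://x.com/alice"]
accounts = resolve_accounts(mixed, fake_fetch)
print(len(accounts))  # three input formats, one account
```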
Conversely, if all you need is the ID, you can call /username-to-id directly instead of fetching the full profile again.
Principle 5: Deduplicate in Your Database Layer
When you collect data from multiple sources - search results, follower lists, mention feeds, timeline scrapes - the same user will appear many times. Use the permanent User ID as your deduplication key and update existing records rather than inserting duplicates.
Principle 6: Avoid Re-fetching Data You Already Have
When building multi-step pipelines, pass data forward between steps instead of re-fetching it. Example: Audience geography analysis. The workflow is: (1) fetch followers, (2) look up country via /about for each follower. After step 1, you already have the full profile object for every follower (bio, follower count, verified status). Do not call /info again in step 2 - you only need /about for the country data that is not in the standard profile.
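A sketch of step 2 under these assumptions: the follower profiles from step 1 are passed in as-is, /about returns a `country` field, and `fetch` stands in for your HTTP client:

```python
def add_countries(followers, fetch):
    """Enrich already-fetched follower profiles with /about country data only."""
    for profile in followers:
        about = fetch("/about", {"user_id": profile["id"]})
        profile["country"] = about.get("country")  # the one field the profile lacks
    return followers

# Follower profiles already in hand from step 1 - no /info calls in step 2.
followers = [{"id": "u1", "username": "alice", "followers_count": 1200}]
about_calls = []
def fake_fetch(endpoint, params):
    about_calls.append(endpoint)
    return {"country": "FI"}

enriched = add_countries(followers, fake_fetch)
print(enriched[0]["country"], about_calls)  # FI ['/about']
```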
Example: Campaign verification. When verifying follow + retweet + comment for a participant, the /check-comment response includes the full tweet object of the comment (when commented: true). If you need to analyze the comment text for quality, extract it from the verification response - do not make a separate /search-tweets or /comments call to find it again.
Example: Building a user list from search results. If you searched for tweets and collected 500 unique users from the embedded user objects, and now want to find which of them have 10K+ followers - filter the data you already collected. Do not call /info for each of the 500 users.
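A sketch of that local filter, combined with the ID-keyed deduplication from Principle 5 (field names are assumptions):

```python
# Users collected earlier from embedded tweet["user"] objects; duplicates are normal.
collected = [
    {"id": "u1", "username": "alice", "followers_count": 15_000},
    {"id": "u2", "username": "bob", "followers_count": 300},
    {"id": "u1", "username": "alice", "followers_count": 15_000},  # seen twice
]

# Deduplicate on the permanent ID, then filter locally - zero extra requests.
unique = {user["id"]: user for user in collected}
big_accounts = [u for u in unique.values() if u["followers_count"] >= 10_000]

print([u["username"] for u in big_accounts])  # ['alice']
```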
Quick Reference: Choosing the Right Endpoint
| You have… | You need… | Use this | Not this |
|---|---|---|---|
| A list of handles | Full profiles for all of them | GET /info-batch (one request) | /info in a loop |
| A list of tweet URLs | Full tweet data + author profiles | POST /tweet-info-bulk (up to 100/request) | /tweet-info in a loop |
| 30 accounts to monitor | Recent tweets from all of them | GET /list-tweets (one request) | /user-tweets x 30 |
| A handle, need the full profile | ID + bio + counts + everything | GET /info | /username-to-id then /info |
| Tweet search results | Author profiles | Extract tweet["user"] from response | /info for each author |
| Follower list | Profile data per follower | Already in /followers response | /info for each follower |
Estimating Your Request Budget
When planning a project, estimate your total request count before you start:
| Task | Efficient approach | Requests needed |
|---|---|---|
| Profile data for 50 accounts | /info-batch | ~1 |
| 10,000 followers of an account | /followers with pagination (200/page) | 50 |
| Country data for those 10,000 followers | /about for each | 10,000 |
| 100 tweets hydrated with current metrics | /tweet-info-bulk | 1 |
| Monitor 50 accounts every 10 seconds for a day | /list-tweets | 8,640 |
| Verify 5 tasks for 1,000 campaign participants | 5 checks per user | 5,000 |
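The rows above follow from plain arithmetic, which is worth scripting once when you plan a project:

```python
import math

followers_pages = math.ceil(10_000 / 200)  # /followers at 200 per page
bulk_requests = math.ceil(100 / 100)       # /tweet-info-bulk at 100 tweets per request
polls_per_day = (24 * 60 * 60) // 10       # one /list-tweets call every 10 seconds
verification_calls = 5 * 1_000             # 5 checks per campaign participant

print(followers_pages, bulk_requests, polls_per_day, verification_calls)
# 50 1 8640 5000
```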
Next Steps
- Rate Limits and Best Practices - handling 429 errors and optimizing throughput.
- Pagination - cursor-based pagination patterns for large-scale extraction.
- API Reference - full specification for /info-batch, /tweet-info-bulk, and all 38 endpoints.