Data Pipeline

X Profile Scanning Pipeline

Built a Playwright pipeline that scanned 20,509 X profiles in 55 hours and filled a spreadsheet with verification status, follower counts, and region data.

The team got a usable creator list and a cleaned database.

Python · Playwright · openpyxl · Automation Research
Profiles: 20,509
Runtime: 55 hours
Verified: 395 found

Problem

What needed to be solved.

A Web3 marketing team had more than 20,000 X usernames and needed three things for each one: verification status, follower count, and region. Manual checking would have taken weeks.

The hard part was not opening one profile. It was keeping a long run alive, saving progress, and ending with a spreadsheet the team could use right away.

Approach

How the system was framed.

I used Playwright instead of the paid X API because this was a one-time job and accuracy mattered more than speed.

The pipeline saved progress every 25 profiles, added random delays, retried failures, and skipped media downloads so the run could stay stable on a normal laptop.
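The run loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the JSON checkpoint format, the 2 to 5 second delay range, the three-attempt retry limit, and the list of blocked resource types are all assumptions. Only the 25-profile checkpoint interval comes from the write-up.

```python
import json
import os
import random
import time

CHECKPOINT_EVERY = 25  # from the write-up: progress saved every 25 profiles
BLOCKED_TYPES = {"image", "media", "font"}  # assumption: resource types to skip


def block_media(route):
    """Playwright route handler: abort heavy resources, let everything else load."""
    if route.request.resource_type in BLOCKED_TYPES:
        route.abort()
    else:
        route.continue_()


def load_checkpoint(path):
    """Return previously saved results, or an empty dict on a fresh run."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            return json.load(fh)
    return {}


def save_checkpoint(path, results):
    """Write to a temp file first so a crash cannot leave a half-written file."""
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as fh:
        json.dump(results, fh)
    os.replace(tmp, path)


def with_retries(fn, arg, attempts=3, base_delay=2.0, sleep=time.sleep):
    """Call fn(arg); wait longer after each failure, return None if all fail."""
    for attempt in range(1, attempts + 1):
        try:
            return fn(arg)
        except Exception:
            if attempt == attempts:
                return None
            sleep(base_delay * attempt + random.uniform(0, 1))


def run(usernames, scan_profile, path="checkpoint.json", sleep=time.sleep):
    """Scan every username that is not already in the checkpoint file."""
    results = load_checkpoint(path)
    pending = [u for u in usernames if u not in results]
    for count, username in enumerate(pending, start=1):
        # A None result marks a profile that failed every attempt.
        results[username] = with_retries(scan_profile, username, sleep=sleep)
        if count % CHECKPOINT_EVERY == 0:
            save_checkpoint(path, results)
        sleep(random.uniform(2.0, 5.0))  # assumed random delay between profiles
    save_checkpoint(path, results)
    return results
```

With Playwright's sync API, the handler would be registered once per page with `page.route("**/*", block_media)`, and `scan_profile` would wrap the page visit and field extraction. In this sketch, a restart simply calls `run` again with the same checkpoint path and resumes where the last save left off.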

Build Details

Architecture, tooling, and operating logic.

  • Checkpoint files so the run could restart without losing progress.
  • Random delays and timed pauses to reduce rate-limit risk.
  • Retry logic for broken pages and temporary failures.
  • Spreadsheet output with filtered views for campaign use.
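For the spreadsheet step, one plausible way to produce a filterable sheet with openpyxl is shown below. The column layout, the `status` field, and the use of an auto-filter with a frozen header row are assumptions made for illustration; the write-up does not say how the filtered views were built.

```python
from openpyxl import Workbook

# Assumed column layout; the real sheet may have used different headers.
HEADERS = ["username", "verified", "followers", "region", "status"]


def write_results(rows, path):
    """Write one row per profile and enable filter dropdowns on the header."""
    wb = Workbook()
    ws = wb.active
    ws.title = "profiles"
    ws.append(HEADERS)
    for row in rows:
        ws.append([row.get(h) for h in HEADERS])
    ws.freeze_panes = "A2"               # keep the header visible while scrolling
    ws.auto_filter.ref = ws.dimensions   # filter dropdowns over the full data range
    wb.save(path)
    return ws.dimensions
```

With filter dropdowns in place, a reader could narrow the sheet to verified accounts or hide rows flagged as dead without touching the underlying data.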

Results

Operational outcome.

  • 20,509 profiles scanned in one unattended run.
  • 395 verified accounts found and 10,192 dead accounts flagged.
  • The client saved outreach time because the scan cleaned the list while enriching it.