
How to Scrape TikTok (2026): Methods, Tools, and Proxies That Work

Alex R.

Published: 2026-04-21

TikTok publishes roughly 34 million videos a day and sits on the most valuable real-time trend signal on the internet. It is also one of the harder scraping targets in 2026: the platform rotates request signatures every few weeks, applies aggressive IP reputation scoring, and shadow-bans accounts that behave like bots.

This guide walks through what actually works right now: which endpoints are reachable, how signatures are computed, which proxy types let you collect video metadata, profile stats, and trending hashtags without getting blocked, and the code patterns to use.

What You Can (Legally) Scrape From TikTok

TikTok's public-facing data (videos, profiles, sounds, hashtags, and trends that any logged-out user can see in a browser) is generally lawful to scrape in the US and EU, subject to the limits listed under Legal and Ethical Guardrails (see the hiQ v. LinkedIn line of cases in the US). Private accounts, drafts, DMs, and anything behind authentication you don't own are off limits.

In practice, the common scraping targets are:

  • Video metadata — caption, music, hashtags, view/like/share/comment counts, upload timestamp
  • Profile pages — follower count, following count, bio, post count, verified status
  • Hashtag and sound feeds — top videos using a given tag or sound
  • Trending and Discover — the currently promoted tags and sounds
  • Comments — top-level comments on public videos
  • Search results — keyword search pages

Three Ways to Scrape TikTok

Method 1: The Web Site (tiktok.com)

The simplest path is to scrape tiktok.com as a regular browser. TikTok renders server-side initially, then hydrates via JavaScript. Some data is embedded in a <script id="__UNIVERSAL_DATA_FOR_REHYDRATION__"> blob on the initial HTML; the rest comes from XHR calls like /api/post/item_list/, /api/user/detail/, and /api/hashtag/info/.

The gotcha: every XHR request requires a set of signed query parameters — msToken, X-Bogus, and _signature — that are computed client-side in an obfuscated TikTok JavaScript file. Without them the API returns {"statusCode":10000} or empty data.
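When a signed call comes back empty, a quick first check is whether the request carried all three parameters at all. The sketch below is illustrative only: `missing_signed_params` is a hypothetical helper, the parameter names come from the paragraph above, and it checks presence, not whether the values are valid.

```python
from urllib.parse import parse_qs, urlparse

# Parameter names as listed in the text; TikTok may rename or add to them.
SIGNED_PARAMS = ("msToken", "X-Bogus", "_signature")


def missing_signed_params(url: str) -> list[str]:
    """Return the signed parameters that are absent or empty in the query string."""
    query = parse_qs(urlparse(url).query)  # blank values are dropped by default
    return [name for name in SIGNED_PARAMS if not query.get(name, [""])[0]]


if __name__ == "__main__":
    print(missing_signed_params("https://www.tiktok.com/api/user/detail/?msToken=abc"))
```

A non-empty result means the request was never going to succeed; an empty result only tells you the parameters are there, so a stale signature can still fail.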

Method 2: The Mobile App API

TikTok's mobile app (m.tiktok.com and its internal api.tiktokv.com) exposes a richer, older API. Video data has more fields, rate limits are looser, and the endpoints are more stable over time. However, mobile API requests require a device_id, iid (install ID), and an X-Gorgon / X-Khronos signature — all computed by native code on a real phone. Replicating this involves either reverse-engineering the .so libraries or running an Android emulator with Frida hooks.

This path is faster and cleaner once working, but the upfront effort is large and each TikTok app update can invalidate your signing. Most scrapers stick with Method 1.

Method 3: Third-Party Scraping APIs

Services like ScrapingFish, ScrapingBee, Apify, and SerpApi sell HTTP endpoints that do the signing and proxy rotation for you. You pay per request ($1–$3 per 1,000 video-detail calls in 2026). For small-volume projects this is cheaper than building and maintaining your own stack.

Proxy Requirements (This Is Where Most Scrapers Fail)

TikTok's anti-bot stack scores every request on IP type, IP reputation, TLS fingerprint, request cadence, and account behavior history. Datacenter IPs are triaged into a harder queue the moment they're detected. What works:

  • Residential proxies — The standard for medium-volume scraping. The target sees a real household IP from a real ISP. SpyderProxy's Budget Residential at $1.75/GB and Premium Residential at $2.75/GB are both well suited; Premium adds 24-hour sticky sessions (useful if you're also scraping comments under a persistent login).
  • Mobile (LTE/4G) proxies — The gold standard. CGNAT means one IP is shared by thousands of real phones, so TikTok can't ban a specific mobile IP without also banning real users. If you're logging in to scrape personalized data or the FYP (For You Page) from a specific region, use LTE proxies at $2/IP.
  • Datacenter proxies — Only viable for non-personalized, unauthenticated scraping of public pages at low volume. TikTok flags the major cloud ASNs (AWS, GCP, Azure, DigitalOcean, Hetzner) within seconds.
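The three tiers above reduce to a short decision rule. This is a minimal sketch, not a provider API: `choose_proxy_type` is a hypothetical helper, and the 1,000 requests/day cut-off for "low volume" is an assumption you should tune from your own block rates.

```python
# Assumed threshold for "low volume"; the article does not give a number.
LOW_VOLUME_PER_DAY = 1_000


def choose_proxy_type(logged_in: bool, requests_per_day: int) -> str:
    """Pick a proxy tier following the guidance in the list above."""
    if logged_in:
        # Logged-in or personalized scraping (e.g. the FYP): mobile IPs.
        return "mobile"
    if requests_per_day <= LOW_VOLUME_PER_DAY:
        # Unauthenticated, public pages, low volume: datacenter can work.
        return "datacenter"
    # Everything else: residential is the standard choice.
    return "residential"


if __name__ == "__main__":
    print(choose_proxy_type(logged_in=False, requests_per_day=50_000))
```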

Working Python Example — Scraping Video Metadata

Here's a minimal working pattern for scraping embedded data from a TikTok video URL through a residential proxy:

import httpx
import json
import re

PROXY = "http://username:password@proxy.spyderproxy.com:8000"

UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/125.0.0.0 Safari/537.36"
)

def scrape_tiktok_video(url: str) -> dict:
    # httpx 0.26+ takes a single `proxy=` argument; the older `proxies=` was removed in 0.28.
    with httpx.Client(
        proxy=PROXY,
        headers={"User-Agent": UA, "Accept-Language": "en-US,en;q=0.9"},
        follow_redirects=True,
        timeout=30.0,
    ) as client:
        r = client.get(url)
        r.raise_for_status()
        m = re.search(
            r'<script id="__UNIVERSAL_DATA_FOR_REHYDRATION__"[^>]*>(.+?)</script>',
            r.text,
            re.DOTALL,
        )
        if not m:
            raise RuntimeError("rehydration blob not found — page may be blocked")
        data = json.loads(m.group(1))
        video = data["__DEFAULT_SCOPE__"]["webapp.video-detail"]["itemInfo"]["itemStruct"]
        return {
            "id": video["id"],
            "caption": video["desc"],
            "likes": video["stats"]["diggCount"],
            "views": video["stats"]["playCount"],
            "comments": video["stats"]["commentCount"],
            "shares": video["stats"]["shareCount"],
            "author": video["author"]["uniqueId"],
            "music": video["music"]["title"],
            "hashtags": [t["hashtagName"] for t in video.get("textExtra", []) if t.get("hashtagName")],
            "created_at": video["createTime"],
        }

print(scrape_tiktok_video("https://www.tiktok.com/@tiktok/video/7347838921385250603"))

Note: I'm parsing the static rehydration blob instead of calling the signed XHR endpoints. For single-video and single-profile lookups this works fine and dodges the signature problem entirely. For feeds, search, or infinite-scroll you need either the signed API path or a browser (Playwright/Puppeteer).
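The same blob approach can be sketched for a profile page. Treat the key names here as assumptions: `webapp.user-detail`, `userInfo`, and the `stats` field names reflect what the blob has been observed to contain, but they are not documented and can change, so verify them against a live page. `parse_profile_html` is a hypothetical helper that takes HTML you have already fetched (for example with the client shown above).

```python
import json
import re

BLOB_RE = re.compile(
    r'<script id="__UNIVERSAL_DATA_FOR_REHYDRATION__"[^>]*>(.+?)</script>',
    re.DOTALL,
)


def parse_profile_html(html: str) -> dict:
    """Extract basic profile stats from already-downloaded profile HTML."""
    m = BLOB_RE.search(html)
    if not m:
        raise ValueError("rehydration blob not found; page may be blocked")
    scope = json.loads(m.group(1))["__DEFAULT_SCOPE__"]
    # Assumed key names; check them against a real response before relying on them.
    info = scope["webapp.user-detail"]["userInfo"]
    user, stats = info["user"], info["stats"]
    return {
        "username": user["uniqueId"],
        "bio": user.get("signature", ""),
        "verified": user.get("verified", False),
        "followers": stats["followerCount"],
        "following": stats["followingCount"],
        "posts": stats["videoCount"],
    }
```

Keeping the parser separate from the HTTP call makes it easy to test against saved HTML and to swap the fetch layer later.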

Scraping Profile + Feed With Playwright

When the rehydration blob is missing (feed pages, For You Page, hashtag infinite scroll), drive a real browser. This costs 10–20× more resources but sidesteps signing entirely:

from playwright.sync_api import sync_playwright

PROXY_SERVER = "http://proxy.spyderproxy.com:8000"
PROXY_USER = "username"
PROXY_PASS = "password"

def scrape_tiktok_hashtag(tag: str, max_videos: int = 50):
    videos = []
    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=True,
            proxy={
                "server": PROXY_SERVER,
                "username": PROXY_USER,
                "password": PROXY_PASS,
            },
        )
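        page = browser.new_page()
        page.goto(f"https://www.tiktok.com/tag/{tag}", wait_until="networkidle")
        seen = 0     # item nodes already processed
        stalled = 0  # consecutive scrolls that loaded nothing new
        # Stop after three empty scrolls so a blocked or exhausted feed can't loop forever.
        while len(videos) < max_videos and stalled < 3:
            items = page.query_selector_all("[data-e2e='challenge-item']")
            stalled = stalled + 1 if len(items) == seen else 0
            for item in items[seen:]:
                a = item.query_selector("a")
                if a:
                    videos.append(a.get_attribute("href"))
            seen = len(items)
            page.mouse.wheel(0, 3000)
            page.wait_for_timeout(2000)
        browser.close()
    return videos[:max_videos]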

Detection and How to Stay Under It

The common failure modes and their fixes:

  • HTTP 403 or captcha — Your IP has been flagged. Rotate to a fresh residential or mobile IP. Don't retry on the same IP; repeated hits on a flagged address only deepen the flag.
  • {"statusCode":10000} — Missing or invalid signature. Either update your signing library or switch to the rehydration-blob approach.
  • Empty itemList responses — Account-level shadowban, or you're hitting the API without a valid msToken cookie. Re-acquire the cookie by loading the home page fresh.
  • TLS fingerprint flagged — requests and default httpx ship with a Python-specific TLS signature. Use curl_cffi or tls-client to mimic a real browser's ClientHello; httpx with http2=True gets closer to browser behavior at the HTTP layer but still presents Python's TLS handshake.
  • Rate limit "429" — You're going too fast from one IP. Slow the cadence or widen the rotation pool. TikTok's unauthenticated rate limits are roughly 200–500 requests/hour/IP from residential.
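The failure modes above can be folded into a small triage function that a retry loop consults. This is a rough sketch under stated assumptions: `Action` and `classify_response` are hypothetical names, the "captcha" substring match is a crude heuristic, and TLS-level blocks are not detectable from the response alone, so they are left out.

```python
import json
from enum import Enum


class Action(Enum):
    OK = "ok"
    ROTATE_IP = "rotate_ip"
    REFRESH_SIGNATURE = "refresh_signature"
    REFRESH_MSTOKEN = "refresh_mstoken"
    SLOW_DOWN = "slow_down"


def classify_response(status_code: int, body: str) -> Action:
    """Map a response to the fix suggested in the list above."""
    if status_code == 429:
        return Action.SLOW_DOWN
    if status_code == 403 or "captcha" in body.lower():  # crude heuristic
        return Action.ROTATE_IP
    try:
        payload = json.loads(body)
    except ValueError:
        return Action.OK  # not JSON (e.g. an HTML page); nothing more to inspect
    if not isinstance(payload, dict):
        return Action.OK
    if payload.get("statusCode") == 10000:
        return Action.REFRESH_SIGNATURE
    if "itemList" in payload and not payload["itemList"]:
        return Action.REFRESH_MSTOKEN
    return Action.OK


if __name__ == "__main__":
    print(classify_response(200, '{"statusCode":10000}'))
```

Returning an action rather than retrying inside the function keeps the policy (how many retries, which pool to rotate into) in one place in the caller.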

TikTok Research API (The Official One)

If your use case is academic or journalistic, TikTok's Research API provides structured access to public data without scraping. It's restricted to approved researchers at accredited institutions and caps out at 1,000 requests/day per app. For commercial work, it's not an option.

Legal and Ethical Guardrails

Scraping public TikTok data is generally lawful in the US and EU, subject to limits. Don't:

  • Scrape anything gated by login unless you own the account
  • Circumvent technical access controls (this triggers CFAA/DMCA concerns in the US)
  • Republish scraped content verbatim at scale without transformation (copyright)
  • Collect personal data of EU users without a GDPR lawful basis
  • Scrape minors' data: COPPA covers children under 13 in the US, and comparable laws elsewhere extend protections to older minors

For most commercial analytics, trend research, and competitive intelligence use cases, public metadata is fine. Consult a lawyer if your use case involves republishing, personal data, or jurisdictions outside US/EU.

FAQs

Is it legal to scrape TikTok?

Scraping publicly accessible TikTok pages (videos, profiles, hashtags that don't require login) is generally lawful in the US and EU under current case law, with caveats. Scraping private accounts, DMs, or circumventing technical controls is not. TikTok's Terms of Service prohibit automated access, which gives them grounds to ban accounts and IPs, but a ToS violation on its own is a contract matter rather than a crime.

Do I need a proxy to scrape TikTok?

Yes, for any non-trivial volume. TikTok rate-limits aggressively per IP and flags datacenter ranges quickly. Use residential proxies for general scraping and mobile (LTE) proxies for logged-in or personalized scraping.
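To size a pool, the rough per-IP ceiling quoted in the detection section (200–500 unauthenticated requests/hour from residential) can be turned into a pacing delay and an IP count. A back-of-the-envelope sketch; both function names are hypothetical and the limits are estimates, not published figures.

```python
import math


def min_delay_seconds(per_ip_limit_per_hour: int) -> float:
    """Smallest average gap between requests on one IP to stay under the limit."""
    return 3600 / per_ip_limit_per_hour


def ips_needed(target_per_hour: int, per_ip_limit_per_hour: int) -> int:
    """Concurrent IPs required to reach a target hourly throughput."""
    return math.ceil(target_per_hour / per_ip_limit_per_hour)


if __name__ == "__main__":
    # Planning against the conservative end of the range (200/hour/IP).
    print(min_delay_seconds(200), ips_needed(10_000, 200))
```

Planning against the lower bound leaves headroom; add random jitter to the delay in practice so the cadence does not look mechanical.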

Why do my TikTok scraping requests return empty results?

Most common causes: missing msToken cookie, invalid X-Bogus signature, flagged IP, or a proxy exit in a region where TikTok is restricted. Start by scraping the rehydration blob from the HTML instead of calling signed XHR endpoints; it works without signing.

Can I scrape the TikTok For You Page?

Only with an authenticated session — the FYP is personalized per user. Use a real TikTok account, a mobile proxy to protect it from association bans, and Playwright or the mobile API. Expect heavy detection.

How much does it cost to scrape TikTok at scale?

Rough 2026 numbers: at 100k videos/month via residential proxy, budget ~$15–$30/month for bandwidth plus developer time. Via a commercial scraping API, the same volume runs $100–$300/month. DIY with mobile proxies for logged-in scraping: $200–$500/month for the proxy pool alone.
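The arithmetic behind the first two figures can be reproduced as follows. The per-GB and per-1,000-call prices are the ones quoted in this article; the 100 KB transferred per video page is purely an assumption (compressed HTML only), so measure your own average before budgeting.

```python
# Prices quoted in this article.
BUDGET_USD_PER_GB = 1.75
PREMIUM_USD_PER_GB = 2.75
API_USD_PER_1K = (1.0, 3.0)

# Assumption: average bytes on the wire per video page, in KB.
ASSUMED_KB_PER_PAGE = 100


def bandwidth_cost(videos: int, kb_per_page: float, usd_per_gb: float) -> float:
    """Monthly proxy bandwidth cost in USD (decimal GB)."""
    gigabytes = videos * kb_per_page / 1_000_000
    return round(gigabytes * usd_per_gb, 2)


def api_cost(videos: int, usd_per_1k: float) -> float:
    """Monthly cost in USD when paying a scraping API per 1,000 calls."""
    return round(videos / 1000 * usd_per_1k, 2)


if __name__ == "__main__":
    n = 100_000
    print(bandwidth_cost(n, ASSUMED_KB_PER_PAGE, BUDGET_USD_PER_GB))   # 17.5
    print(bandwidth_cost(n, ASSUMED_KB_PER_PAGE, PREMIUM_USD_PER_GB))  # 27.5
    print([api_cost(n, p) for p in API_USD_PER_1K])                    # [100.0, 300.0]
```

Under that page-weight assumption, 100k pages is about 10 GB, which lands inside the $15–$30 range; heavier pages or retries scale the bandwidth figure linearly.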

What's the difference between scraping tiktok.com and the mobile API?

tiktok.com is easier to reach but has stricter signing and fewer data fields. The mobile app API (api.tiktokv.com) returns richer video metadata and looser rate limits but requires replicating X-Gorgon signing from native mobile code — much higher upfront effort.

Can I scrape TikTok without writing signing code?

Yes, two options: parse the __UNIVERSAL_DATA_FOR_REHYDRATION__ blob from the HTML (works for individual videos/profiles), or drive a real browser via Playwright (works for everything but is slower). Both avoid re-implementing TikTok's signature logic.

How long do TikTok IPs stay flagged?

Our observation: residential IPs flagged for aggressive scraping cool off in 24–72 hours. Mobile IPs cool off faster (4–12 hours) because the IP serves many real users. Datacenter IPs once flagged stay flagged essentially forever for TikTok.
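If you track flagged IPs yourself, these windows can drive a simple cool-off check. A minimal sketch using the upper bound of each observed range; `COOL_OFF`, `reusable_at`, and `is_reusable` are hypothetical names, and the windows are observations rather than guarantees.

```python
from datetime import datetime, timedelta
from typing import Optional

# Upper bounds of the cool-off ranges above; None means "do not reuse".
COOL_OFF = {
    "residential": timedelta(hours=72),
    "mobile": timedelta(hours=12),
    "datacenter": None,
}


def reusable_at(ip_type: str, flagged_at: datetime) -> Optional[datetime]:
    """Earliest time a flagged IP should go back into rotation, if ever."""
    window = COOL_OFF[ip_type]
    return None if window is None else flagged_at + window


def is_reusable(ip_type: str, flagged_at: datetime, now: datetime) -> bool:
    """True once the cool-off window for this IP type has fully elapsed."""
    ready = reusable_at(ip_type, flagged_at)
    return ready is not None and now >= ready


if __name__ == "__main__":
    flagged = datetime(2026, 4, 21, 12, 0)
    print(reusable_at("mobile", flagged))
```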

Bottom Line

Scraping TikTok in 2026 comes down to two choices: parse the rehydration blob and use residential proxies for most data, or drive a full browser and use mobile proxies for anything that needs an account. Budget $1.75–$2.75/GB on residential or $2/IP on mobile, and plan for TikTok's request signing to change every few weeks.

SpyderProxy's Budget Residential and LTE Mobile pools are both battle-tested for TikTok scraping at scale. See also our guides on scraping LinkedIn and scraping Instagram, which share many of the same patterns.
