TikTok publishes roughly 34 million videos a day and sits on arguably the most valuable real-time trend signal on the internet. It is also one of the harder scraping targets in 2026: the platform rotates request signatures every few weeks, applies aggressive IP reputation scoring, and shadow-bans accounts that behave like bots.
This guide walks through what actually works right now: which endpoints are reachable, how signatures are computed, which proxy types let you collect video metadata, profile stats, and trending hashtags without getting blocked, and the code patterns to use.
TikTok's public-facing data (videos, profiles, sounds, hashtags, and trends that any logged-out user can see in a browser) is generally treated as fair game for scraping in most jurisdictions; see the hiQ v. LinkedIn line of cases in the US. Private accounts, drafts, DMs, and anything behind authentication you don't own are off limits.
In practice, the common scraping targets are video metadata, profile stats, and trending hashtags.
The simplest path, Method 1, is to scrape tiktok.com the way a regular browser does. TikTok renders server-side initially, then hydrates via JavaScript. Some data is embedded in a <script id="__UNIVERSAL_DATA_FOR_REHYDRATION__"> blob on the initial HTML; the rest comes from XHR calls like /api/post/item_list/, /api/user/detail/, and /api/hashtag/info/.
The gotcha: every XHR request requires a set of signed query parameters — msToken, X-Bogus, and _signature — that are computed client-side in an obfuscated TikTok JavaScript file. Without them the API returns {"statusCode":10000} or empty data.
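One way to tell a signature rejection apart from usable data is to inspect the response body before parsing any further. The sketch below assumes the two failure shapes described above (a statusCode of 10000, or a payload with no items) on a list endpoint such as /api/post/item_list/. The function name, the labels it returns, and the itemList key are illustrative assumptions, not a documented contract.

```python
import json


def classify_api_response(body: str) -> str:
    """Label a raw XHR response body from a list endpoint.

    Returns one of 'ok', 'signature_rejected', 'empty', or 'not_json'.
    The labels are hypothetical names chosen for this sketch.
    """
    if not body.strip():
        return "empty"  # zero-length body
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return "not_json"  # e.g. an HTML block page instead of JSON
    if not isinstance(payload, dict):
        return "not_json"
    if payload.get("statusCode") == 10000:
        return "signature_rejected"  # missing or invalid signed parameters
    if not payload.get("itemList"):
        return "empty"  # assumed key for list results; absent or []
    return "ok"
```

Checking the label first lets a scraper decide whether to refresh its signing, re-acquire cookies, or simply move on, instead of failing later on a missing key.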
TikTok's mobile app (m.tiktok.com and its internal api.tiktokv.com) exposes a richer, older API. Video data has more fields, rate limits are looser, and the endpoints are more stable over time. However, mobile API requests require a device_id, iid (install ID), and an X-Gorgon / X-Khronos signature — all computed by native code on a real phone. Replicating this involves either reverse-engineering the .so libraries or running an Android emulator with Frida hooks.
This path is faster and cleaner once working, but the upfront effort is large and each TikTok app update can invalidate your signing. Most scrapers stick with Method 1.
Services like ScrapingFish, ScrapingBee, Apify, and SerpApi sell HTTP endpoints that do the signing and proxy rotation for you. You pay per request ($1–$3 per 1,000 video-detail calls in 2026). For small-volume projects this is cheaper than building and maintaining your own stack.
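TikTok's anti-bot stack scores every request on IP type, IP reputation, TLS fingerprint, request cadence, and account behavior history. Datacenter IPs are triaged into a harder queue the moment they're detected. What works in practice is residential proxies for general scraping and mobile (LTE) proxies for logged-in or personalized scraping.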
Here's a minimal working pattern for scraping embedded data from a TikTok video URL through a residential proxy:
import json
import re

import httpx

# Placeholder credentials; substitute your own proxy endpoint.
PROXY = "http://username:password@proxy.spyderproxy.com:8000"
UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/125.0.0.0 Safari/537.36"
)


def scrape_tiktok_video(url: str) -> dict:
    # httpx 0.26+ takes `proxy=`; older releases used `proxies=`.
    with httpx.Client(
        proxy=PROXY,
        headers={"User-Agent": UA, "Accept-Language": "en-US,en;q=0.9"},
        follow_redirects=True,
        timeout=30.0,
    ) as client:
        r = client.get(url)
        r.raise_for_status()

    # The video metadata is embedded as JSON in the initial HTML.
    m = re.search(
        r'<script id="__UNIVERSAL_DATA_FOR_REHYDRATION__"[^>]*>(.+?)</script>',
        r.text,
        re.DOTALL,
    )
    if not m:
        raise RuntimeError("rehydration blob not found; page may be blocked")
    data = json.loads(m.group(1))
    video = data["__DEFAULT_SCOPE__"]["webapp.video-detail"]["itemInfo"]["itemStruct"]
    return {
        "id": video["id"],
        "caption": video["desc"],
        "likes": video["stats"]["diggCount"],
        "views": video["stats"]["playCount"],
        "comments": video["stats"]["commentCount"],
        "shares": video["stats"]["shareCount"],
        "author": video["author"]["uniqueId"],
        "music": video["music"]["title"],
        # textExtra holds mentions as well as hashtags; keep only the hashtags.
        "hashtags": [t["hashtagName"] for t in video.get("textExtra", []) if t.get("hashtagName")],
        "created_at": video["createTime"],
    }


if __name__ == "__main__":
    print(scrape_tiktok_video("https://www.tiktok.com/@tiktok/video/7347838921385250603"))
Note: I'm parsing the static rehydration blob instead of calling the signed XHR endpoints. For single-video and single-profile lookups this works fine and dodges the signature problem entirely. For feeds, search, or infinite-scroll you need either the signed API path or a browser (Playwright/Puppeteer).
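Because the blob's layout can change without notice, it may help to parse it defensively rather than chain dictionary lookups that raise KeyError on the first missing field. The following is a minimal sketch under that assumption; extract_blob and dig are hypothetical helper names, and the key path in the usage note is the one used in the example above.

```python
import json
import re
from typing import Any, Optional

BLOB_RE = re.compile(
    r'<script id="__UNIVERSAL_DATA_FOR_REHYDRATION__"[^>]*>(.+?)</script>',
    re.DOTALL,
)


def extract_blob(html: str) -> Optional[dict]:
    """Return the parsed rehydration blob, or None if it is absent or malformed."""
    m = BLOB_RE.search(html)
    if not m:
        return None
    try:
        data = json.loads(m.group(1))
    except json.JSONDecodeError:
        return None
    return data if isinstance(data, dict) else None


def dig(data: Any, *keys: str, default: Any = None) -> Any:
    """Walk nested dicts by key, returning `default` as soon as a step is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return default
        data = data[key]
    return data
```

With these helpers, dig(extract_blob(html), "__DEFAULT_SCOPE__", "webapp.video-detail", "itemInfo", "itemStruct") returns None on a blocked or restructured page, which the caller can log and retry instead of crashing.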
When the rehydration blob is missing (feed pages, For You Page, hashtag infinite scroll), drive a real browser. This costs 10–20× more resources but sidesteps signing entirely:
from playwright.sync_api import sync_playwright

PROXY_SERVER = "http://proxy.spyderproxy.com:8000"
PROXY_USER = "username"
PROXY_PASS = "password"


def scrape_tiktok_hashtag(tag: str, max_videos: int = 50) -> list:
    videos = []  # video URLs in page order, de-duplicated
    with sync_playwright() as p:
        browser = p.chromium.launch(
            headless=True,
            proxy={
                "server": PROXY_SERVER,
                "username": PROXY_USER,
                "password": PROXY_PASS,
            },
        )
        page = browser.new_page()
        page.goto(f"https://www.tiktok.com/tag/{tag}", wait_until="networkidle")
        stalled = 0
        # Stop after three scrolls with no new links so the loop cannot spin
        # forever when the page is blocked or the tag has few videos.
        while len(videos) < max_videos and stalled < 3:
            before = len(videos)
            for item in page.query_selector_all("[data-e2e='challenge-item']"):
                a = item.query_selector("a")
                href = a.get_attribute("href") if a else None
                if href and href not in videos:
                    videos.append(href)
            stalled = stalled + 1 if len(videos) == before else 0
            page.mouse.wheel(0, 3000)  # scroll to trigger the next batch
            page.wait_for_timeout(2000)
        browser.close()
    return videos[:max_videos]
The common failure modes and their fixes:
{"statusCode":10000}: Missing or invalid signature. Either update your signing library or switch to the rehydration-blob approach.
Empty itemList responses: Account-level shadowban, or you're hitting the API without a valid msToken cookie. Re-acquire the cookie by loading the home page fresh.
requests and default httpx ship with a Python-specific TLS signature. Use curl_cffi, httpx with http2=True, or tls-client to mimic a real browser's ClientHello.

If your use case is academic or journalistic, TikTok's Research API provides structured access to public data without scraping. It's restricted to approved researchers at accredited institutions and caps out at 1,000 requests/day per app. For commercial work, it's not an option.
Scraping public TikTok data is generally lawful in the US and EU, subject to limits. Don't:
For most commercial analytics, trend research, and competitive intelligence use cases, public metadata is fine. Consult a lawyer if your use case involves republishing, personal data, or jurisdictions outside US/EU.
Scraping publicly accessible TikTok pages (videos, profiles, hashtags that don't require login) is generally lawful in the US and EU under current case law, with caveats. Scraping private accounts, DMs, or circumventing technical controls is not. TikTok's Terms of Service prohibit automated access, which gives them grounds to ban accounts and IPs, but ToS violations are not criminal.
Yes, for any non-trivial volume. TikTok rate-limits aggressively per IP and flags datacenter ranges quickly. Use residential proxies for general scraping and mobile (LTE) proxies for logged-in or personalized scraping.
Most common causes: missing msToken cookie, invalid X-Bogus signature, flagged IP, or you're hitting a geo-restricted region. Start by scraping the rehydration blob from the HTML instead of calling signed XHR endpoints — it works without signing.
Only with an authenticated session — the FYP is personalized per user. Use a real TikTok account, a mobile proxy to protect it from association bans, and Playwright or the mobile API. Expect heavy detection.
Rough 2026 numbers: at 100k videos/month via residential proxy, budget ~$15–$30/month for bandwidth plus developer time. Via a commercial scraping API, the same volume runs $100–$300/month. DIY with mobile proxies for logged-in scraping: $200–$500/month for the proxy pool alone.
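The arithmetic behind those ranges can be sketched as below. The per-1,000-call and per-GB prices are the figures quoted in this guide; the 100 KB average page size is an assumption introduced purely for illustration, so measure your own traffic before budgeting.

```python
def api_cost_usd(videos: int, usd_per_1k_calls: float) -> float:
    """Cost of fetching `videos` items through a pay-per-request scraping API."""
    return videos / 1000 * usd_per_1k_calls


def bandwidth_cost_usd(videos: int, kb_per_page: float, usd_per_gb: float) -> float:
    """Cost of residential bandwidth, using decimal units (1 GB = 1,000,000 KB)."""
    gigabytes = videos * kb_per_page / 1_000_000
    return gigabytes * usd_per_gb


if __name__ == "__main__":
    monthly_videos = 100_000
    assumed_kb_per_page = 100.0  # hypothetical average transfer per video page
    print("API, low:", api_cost_usd(monthly_videos, 1.0))
    print("API, high:", api_cost_usd(monthly_videos, 3.0))
    print("Residential, low:", bandwidth_cost_usd(monthly_videos, assumed_kb_per_page, 1.75))
    print("Residential, high:", bandwidth_cost_usd(monthly_videos, assumed_kb_per_page, 2.0))
```

Under that page-size assumption the residential estimate lands inside the $15–$30 range, and the API estimate matches the $100–$300 range at $1–$3 per 1,000 calls.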
tiktok.com is easier to reach but has stricter signing and fewer data fields. The mobile app API (api.tiktokv.com) returns richer video metadata and has looser rate limits, but it requires replicating X-Gorgon signing from native mobile code, which means much higher upfront effort.
Yes, two options: parse the __UNIVERSAL_DATA_FOR_REHYDRATION__ blob from the HTML (works for individual videos/profiles), or drive a real browser via Playwright (works for everything but is slower). Both avoid re-implementing TikTok's signature logic.
Our observation: residential IPs flagged for aggressive scraping cool off in 24–72 hours. Mobile IPs cool off faster (4–12 hours) because the IP serves many real users. Datacenter IPs once flagged stay flagged essentially forever for TikTok.
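Those cool-off windows can be turned into a simple rotation policy: bench a flagged proxy for the upper end of its window and only hand out proxies that are not benched. The sketch below is one possible shape for that, not a description of any provider's tooling; ProxyPool, its method names, and the injectable clock are all hypothetical.

```python
import time
from typing import Callable, Dict, List, Optional

# Cool-off in hours, taking the upper end of the observed ranges.
# Datacenter IPs are treated as permanently burned once flagged.
COOL_OFF_HOURS = {"residential": 72.0, "mobile": 12.0, "datacenter": float("inf")}


class ProxyPool:
    def __init__(self, proxies: Dict[str, str], clock: Callable[[], float] = time.time):
        # `proxies` maps proxy URL -> kind (a key of COOL_OFF_HOURS).
        self._kinds = dict(proxies)
        self._benched_until: Dict[str, float] = {}
        self._clock = clock  # injectable so the policy can be tested without waiting

    def flag(self, proxy: str) -> None:
        """Bench a proxy that just got blocked, for its kind's cool-off window."""
        hours = COOL_OFF_HOURS[self._kinds[proxy]]
        self._benched_until[proxy] = self._clock() + hours * 3600

    def available(self) -> List[str]:
        """Proxies that were never flagged or whose cool-off has elapsed."""
        now = self._clock()
        return [p for p in self._kinds if self._benched_until.get(p, 0.0) <= now]

    def pick(self) -> Optional[str]:
        """First available proxy in insertion order, or None if all are benched."""
        ready = self.available()
        return ready[0] if ready else None
```

A scraper would call flag() when a proxy starts returning block pages and pick() before each request; a None result is the signal to pause or add capacity instead of retrying on burned IPs.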
Scraping TikTok in 2026 comes down to two choices: parse the rehydration blob + use residential proxies for most data, or drive a full browser + use mobile proxies for anything that needs an account. Budget $1.75–$2/GB on residential or $2/IP on mobile and plan for the signing API to rotate every few months.
SpyderProxy's Budget Residential and LTE Mobile pools are both battle-tested for TikTok scraping at scale. See also our guides on scraping LinkedIn and scraping Instagram, which share many of the same patterns.