spyderproxy

What Is a Headless Browser? Complete 2026 Guide

Daniel K. | Published Sun May 10 2026

Quick verdict: A headless browser is a real web browser (Chrome, Firefox, Safari) running without a graphical window. Same JavaScript engine, same DOM, same network stack — you just cannot see the page. Used for testing (run E2E tests in CI), scraping (handle JavaScript-rendered sites), automation (fill forms, take screenshots, generate PDFs), and security research. Main tools in 2026: Playwright, Puppeteer, Selenium. Most scrapers should default to HTTP clients (requests, httpx) and only reach for headless when JavaScript rendering is required — headless browsers use 100-200x the memory of plain HTTP.

The Definition

A normal browser launches a GUI window, renders pages, and accepts user input. A headless browser does everything except the window:

  • Same browser engine (Blink for Chrome, Gecko for Firefox, WebKit for Safari)
  • Same JavaScript runtime (V8, SpiderMonkey, JavaScriptCore)
  • Same DOM, CSS rendering, layout calculations
  • Same network stack including HTTP/2, HTTP/3, TLS
  • Same web APIs (fetch, WebSocket, Service Workers, IndexedDB)
  • No visible window, no need for a desktop environment, no GPU compositing required

From the destination server's perspective, a request from headless Chrome looks identical to one from regular Chrome — same User-Agent, same TLS fingerprint, same HTTP/2 settings. Modern detection systems try to distinguish the two with JavaScript checks (the navigator.webdriver flag, missing navigator properties), but at the network layer they are identical.

Why Headless Browsers Exist

Three driving use cases:

1. Web Scraping (JavaScript-rendered pages)

Many modern sites are Single Page Applications — the initial HTML is mostly empty, content gets injected by JavaScript after page load. curl https://example.com returns the empty shell; a headless browser runs the JS and exposes the populated DOM.

Examples: React/Vue admin dashboards, Twitter feeds, Reddit listings, modern e-commerce sites.
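
Before reaching for a browser, it is worth triaging the raw HTML for SPA shell markers. The following is a rough heuristic sketch — the markers (an empty root/app mount node, little visible text) are illustrative, not exhaustive:

```python
import re

def looks_like_spa_shell(html: str) -> bool:
    # Strip script/style blocks, then count the words a user would actually see.
    stripped = re.sub(r"(?s)<(script|style)[^>]*>.*?</\1>", "", html)
    text = re.sub(r"<[^>]+>", " ", stripped)
    visible_words = len(text.split())
    # An empty mount node is a strong SPA hint (React/Vue conventions).
    empty_root = bool(re.search(r'<div id="(?:root|app)">\s*</div>', html))
    return empty_root or visible_words < 50

shell = '<html><body><div id="root"></div><script src="/bundle.js"></script></body></html>'
print(looks_like_spa_shell(shell))  # True — needs JS rendering
```

If this returns False, the data is probably in the initial response and plain HTTP will do.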

2. Automated Testing

End-to-end tests need to simulate real user interactions in a real browser. Running browsers headlessly lets CI servers run tests without a display:

# GitHub Actions example
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npx playwright test  # runs headless by default in CI

3. Automation Tasks

Anything that needs a real browser but is not a user-facing flow:

  • Generating PDFs from HTML (server-side reports)
  • Screenshotting pages at scale (preview thumbnails, monitoring)
  • Filling government / banking portals that lack APIs
  • Security scanning (looking for XSS, CSRF, broken auth)
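
The screenshot and PDF cases above can be sketched with Playwright's sync API. The URL, filenames, and the output_name helper are illustrative, and PDF export works in Chromium only:

```python
def output_name(url: str, kind: str) -> str:
    # Pure-Python helper: derive a filesystem-safe name from a URL.
    slug = url.split("//", 1)[-1].replace("/", "_").strip("_")
    return f"{slug}.{kind}"

def capture(url: str) -> None:
    # Import inside the function so the helper above works without Playwright installed.
    from playwright.sync_api import sync_playwright
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url)
        page.screenshot(path=output_name(url, "png"), full_page=True)
        page.pdf(path=output_name(url, "pdf"))  # PDF export is Chromium-only
        browser.close()

# capture("https://example.com")  # writes example.com.png and example.com.pdf
```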

The 3 Main Tools in 2026

| Tool | Maker | Browsers | Languages | Best for |
| --- | --- | --- | --- | --- |
| Playwright | Microsoft | Chromium, Firefox, WebKit | Python, Node, Java, .NET | Modern default; cross-browser |
| Puppeteer | Google | Chromium (some Firefox) | Node (community Python: pyppeteer) | Chrome-specific work |
| Selenium | Open source | All major | Python, Java, Ruby, C#, JS | Legacy compatibility, broad browser support |

Playwright is the modern default — cleaner API, faster, more reliable cross-browser support. Puppeteer is fine if you only need Chromium. Selenium is the legacy choice with the largest ecosystem, but it is slower and clunkier than the alternatives.

Code Example: Playwright in Python

pip install playwright
python -m playwright install chromium

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto("https://example.com/spa-page")
    page.wait_for_selector("div.content-loaded")
    title = page.title()
    html = page.content()
    browser.close()

print(title)

That launches headless Chromium, navigates to a SPA, waits for client-side content to render, then captures the title and HTML. The same code runs with a visible window if you pass headless=False — useful for debugging.

The Cost: Memory + Speed

Headless browsers are 100-200x heavier than HTTP clients:

| Approach | Memory per request | Pages/second per worker |
| --- | --- | --- |
| Plain Python requests | ~2 MB | 100+ (network-bound) |
| aiohttp / httpx async | ~2 MB | 500+ (concurrent) |
| curl_cffi (TLS-impersonated requests) | ~3 MB | 100+ |
| Headless Chrome via Playwright | ~250 MB | 1-2 (CPU-bound) |
| Headless Firefox via Playwright | ~300 MB | 1-2 |

The takeaway: only use a headless browser when JavaScript rendering is REQUIRED. For static HTML, plain HTTP wins by orders of magnitude.
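
A back-of-envelope sizing check using the rough figures from the table, assuming a hypothetical 8 GB worker box:

```python
ram_mb = 8 * 1024          # assumed 8 GB worker box
http_session_mb = 2        # plain requests / httpx, from the table
chrome_instance_mb = 250   # headless Chromium via Playwright, from the table

max_http = ram_mb // http_session_mb
max_chrome = ram_mb // chrome_instance_mb
print(max_http, max_chrome)  # 4096 32
```

Thousands of concurrent HTTP sessions versus a few dozen browsers — that is the 100-200x gap in practice.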

When You Do NOT Need Headless

  • Site returns the data you need in the initial HTML response — check with curl
  • Site has a JSON API endpoint — check Network tab in DevTools
  • Site sends data via XHR/fetch after page load that you can replay directly — intercept and reuse the request
  • Mobile endpoints often expose the same data without JS — check the target site's m. subdomain (e.g. m.example.com)

Many scrapers reach for Playwright by default. Often the right answer is "spend 30 min in DevTools finding the real endpoint and use plain HTTP" — faster, lighter, and more resilient.
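
A sketch of that approach with only the standard library — the endpoint path and query parameter here are hypothetical; copy the real ones from the Network tab:

```python
import urllib.request

API = "https://example.com/api/v2/listings"  # hypothetical endpoint found in DevTools

def build_request(page: int) -> urllib.request.Request:
    # Mirror the headers the browser sent; many APIs reject bare requests.
    return urllib.request.Request(
        f"{API}?page={page}",
        headers={
            "Accept": "application/json",
            "X-Requested-With": "XMLHttpRequest",
        },
    )

req = build_request(2)
print(req.full_url)  # https://example.com/api/v2/listings?page=2
# body = urllib.request.urlopen(req).read()  # actually fetch, at ~2 MB per worker
```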

When You DO Need Headless

  • Site uses heavy client-side rendering and the data only appears after JS runs
  • Site requires interaction (click "load more," scroll, fill form) before showing data
  • Site fingerprints heavily and only a real browser passes (Cloudflare Turnstile, DataDome challenges)
  • You need a screenshot or PDF rendering of the final page
  • You are testing an actual user flow, not just data extraction

Headless + Proxies

Most scraping with headless browsers needs proxies:

browser = p.chromium.launch(
    headless=True,
    proxy={
        "server": "http://gw.spyderproxy.com:8000",
        "username": "USER",
        "password": "PASS",
    },
)

For sites with strong anti-bot, Premium Residential ($2.75/GB) is the default. For account-based workflows, LTE Mobile ($2/IP) has the lowest detection rate. Plain datacenter IPs are blocked instantly by most sites that justify a headless browser in the first place.
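
One common pattern is to give each Playwright BrowserContext its own proxy session so a single browser process can rotate exit IPs. The sticky-session username format below is hypothetical — formats differ by provider:

```python
def proxy_config(session_id: str) -> dict:
    # Sticky-session username is a hypothetical format; check your provider's docs.
    return {
        "server": "http://gw.spyderproxy.com:8000",
        "username": f"USER-session-{session_id}",
        "password": "PASS",
    }

# Usage with Playwright: each context gets its own exit IP.
# context = browser.new_context(proxy=proxy_config("job-42"))
```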

Headless Browser Detection

Modern anti-bot systems try to distinguish headless from regular browsers:

  • navigator.webdriver is true — biggest tell. Override with stealth plugins.
  • Missing browser features — some headless modes lack notifications, permissions APIs.
  • Unusual screen dimensions — the headless default is 800x600; real browsers are typically 1920x1080+.
  • Plugin list is empty — real browsers have at least PDF viewer.
  • Mouse / touch event patterns — if you script clicks without mouse movement, fingerprint detectors flag you.

For tough targets, use playwright-stealth, undetected-chromedriver (Selenium fork), or pair with an antidetect browser for maximum stealth.
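
Before adding a stealth library, the two easiest tells above — the webdriver flag and the 800x600 viewport — can be countered directly. A minimal sketch; the values are illustrative and real anti-bot systems check far more than this:

```python
# JS injected before any page script runs; hides the biggest tell.
HIDE_WEBDRIVER = (
    "Object.defineProperty(navigator, 'webdriver', {get: () => undefined})"
)

def realistic_context_options() -> dict:
    # Illustrative values: a common desktop viewport instead of the 800x600 default.
    return {
        "viewport": {"width": 1920, "height": 1080},
        "locale": "en-US",
    }

# Usage with Playwright:
# context = browser.new_context(**realistic_context_options())
# context.add_init_script(HIDE_WEBDRIVER)
```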

Related: Headless browser in Python, Cheerio vs Puppeteer, Puppeteer vs Playwright vs Selenium.