Quick verdict: Modern real-time fraud detection scores every transaction, login, or account creation against five signals — IP reputation, device fingerprint, behavioral velocity, geolocation consistency, and network type — in under 200 milliseconds. Fraud teams use residential proxies two ways: to classify proxies attackers use, and to fetch ground-truth web data from real geographies for verification. The hardest engineering constraint is gathering all five signals in parallel inside the latency budget.
This guide covers the architecture of a real-time fraud pipeline, the five signals that actually matter (with how to source each), the proxy infrastructure required, and the GDPR/CCPA constraints that shape every modern fraud system.
The Problem: Fraud Moves Faster Than Static Rules
Traditional fraud detection runs rules: "block if country = X", "flag if velocity > Y". Three problems:
- Attackers learn the rules. Public block lists become roadmaps for which countries to spoof, which velocities to stay under.
- Static rules don't catch behavioral anomalies. A fraudster using a clean US residential IP, a real device, and humanlike timing will pass every rule a static system checks.
- By the time a rule fires retroactively, the fraud has succeeded. Chargebacks weeks later don't help.
Real-time detection collapses the cycle: score every event inline, in <200 ms, before the action completes.
The 5 Real-Time Data Signals That Matter
1. IP reputation
Is this IP on known abuse lists? Has it been used in past fraud? Is it a known anonymizer (VPN, Tor, datacenter)?
- Sources: Spamhaus, Project Honey Pot, MaxMind GeoIP, IPHub, IP-API.
- Latency: Sub-10 ms via cached lookups.
- Failure mode: Brand-new compromised residential IPs aren't on any list yet — IP reputation alone catches roughly 60% of fraud, not 100%.
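The sub-10 ms budget is only achievable if the hot path never makes a network call. A minimal sketch of that pattern — an in-process TTL cache in front of whatever backing store you use (the `fetch_fn` interface and field names here are illustrative, not any vendor's API):

```python
import time

class IPReputationCache:
    """In-process TTL cache so the scoring hot path never blocks on the network."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._cache = {}  # ip -> (reputation dict, fetched_at)

    def lookup(self, ip, fetch_fn):
        entry = self._cache.get(ip)
        now = time.monotonic()
        if entry and now - entry[1] < self.ttl:
            return entry[0]          # cache hit: microseconds, no I/O
        reputation = fetch_fn(ip)    # cache miss: hits the backing store
        self._cache[ip] = (reputation, now)
        return reputation

# Usage with a stub standing in for a real reputation backend:
cache = IPReputationCache()
stub = lambda ip: {"abuse_listed": False, "anonymizer": "none"}
rep = cache.lookup("203.0.113.7", stub)
```

In production the cache would typically be Redis shared across scoring nodes, with the backing fetch done asynchronously so a cold IP degrades to "unknown" rather than blowing the latency budget.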
2. Device fingerprint
Same device used by 200 different accounts last week? Likely a bot farm. Same fingerprint hit your site twice in 5 seconds from different IPs? Likely a session-hijack attempt.
- Inputs: ~40 browser/device attributes — User-Agent, screen resolution, timezone, fonts, canvas fingerprint, WebGL renderer, audio fingerprint.
- Latency: Generated client-side and sent in a request header; the lookup against your fingerprint store runs in under 20 ms.
- Failure mode: Antidetect browsers (see our VM vs antidetect comparison) generate plausible random fingerprints. Defense requires comparing fingerprint stability over time.
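A sketch of the stability check that catches antidetect browsers: hash the attribute set canonically, then measure how often a returning account reuses its most common fingerprint. Per-session randomization drives the stability score toward `1/n`. (The attribute names and 16-char hash truncation are illustrative choices, not a standard.)

```python
import hashlib
import json

def fingerprint_hash(attrs: dict) -> str:
    """Stable hash over client-collected browser/device attributes.
    Canonical JSON ensures key order doesn't change the hash."""
    canonical = json.dumps(attrs, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

def stability_score(history: list) -> float:
    """Fraction of recent sessions reusing the account's most common
    fingerprint. Antidetect browsers randomizing per session trend
    toward 1/len(history); real devices trend toward 1.0."""
    if not history:
        return 0.0
    top = max(history.count(h) for h in set(history))
    return top / len(history)

fp = fingerprint_hash({"ua": "Mozilla/5.0", "screen": "1920x1080", "tz": "America/New_York"})
```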
3. Behavioral velocity
Real users type at 30–60 WPM, click with curved mouse paths, scroll at variable speeds. Bots have characteristic patterns: linear mouse paths, fixed-interval clicks, instant form fills.
- Sources: Client-side JS instrumentation that captures keystroke timing, mousemove deltas, scroll velocity. Beacon to server.
- Latency: Captured during the session; scored at decision point in <30 ms.
- Failure mode: Sophisticated bots like Selenium with humanizer plugins can replicate the rough behavior. Detection requires deeper micro-pattern analysis.
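One concrete micro-pattern from the list above: inter-keystroke timing variability. Humans show high variance; a bot filling a form at fixed intervals scores near zero. A minimal sketch using the coefficient of variation (the 0.3 threshold is an assumed illustrative cutoff, not a calibrated value):

```python
from statistics import mean, pstdev

def keystroke_variability(intervals_ms: list) -> float:
    """Coefficient of variation of inter-keystroke intervals.
    Fixed-interval bots approach 0; human typing is typically much higher."""
    if len(intervals_ms) < 2 or mean(intervals_ms) == 0:
        return 0.0
    return pstdev(intervals_ms) / mean(intervals_ms)

human_like = [120, 210, 95, 340, 180, 260]   # irregular, bursty
bot_like = [100, 100, 100, 100, 100, 100]    # metronomic form fill
```

In practice this is one feature among dozens (mousemove curvature, scroll acceleration, paste-vs-type detection) fed into the ML stage rather than thresholded on its own.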
4. Geolocation consistency
The IP geo says New York but the GPS says Lagos? Strong fraud signal. The shipping address is in California but the card was issued by a Russian bank? Probable card-not-present fraud.
- Sources: MaxMind for IP geo, GPS from mobile devices, billing/shipping from form, BIN lookup for card issuer country.
- Latency: All sub-50 ms in parallel.
- Failure mode: Residential proxies match the IP geo to the claimed location, defeating naive geo-mismatch checks.
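The pairwise checks described above can be sketched as a small flag generator over the four location sources (country-code inputs and flag names are illustrative; a real system weights these in the ML stage rather than treating each flag as a block):

```python
def geo_consistency_flags(ip_country, gps_country, billing_country, bin_country):
    """Return mismatch flags across the four location sources.
    Missing sources (None) are skipped rather than treated as mismatches."""
    flags = []
    if gps_country and ip_country and ip_country != gps_country:
        flags.append("ip_gps_mismatch")
    if bin_country and billing_country and billing_country != bin_country:
        flags.append("billing_bin_mismatch")
    if gps_country and billing_country and gps_country != billing_country:
        flags.append("gps_billing_mismatch")
    return flags
```

Note the residential-proxy failure mode: a proxy in the claimed city clears `ip_gps_mismatch`, which is exactly why the BIN and billing cross-checks stay in the set.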
5. Network type
Residential, mobile, datacenter, VPN, Tor — each implies different fraud baseline rates. Mobile IPs are shared and noisy; Tor exits are heavily skewed toward fraud; clean residential is closest to baseline.
- Sources: ASN database (Hurricane Electric, MaxMind), proxy-detection databases (IPHub, IPQualityScore).
- Latency: Sub-15 ms.
- Defense angle: A residential ISP proxy from a real Comcast or AT&T range looks indistinguishable from a real customer at the network-type level — that's why network type is one signal of five, never the only one.
How Web Scraping Powers Fraud Detection
Fraud vendors run their own scraping infrastructure to collect ground-truth data that can't be faked:
- Marketplace listings — what's a real product price vs a fraud listing's suspiciously low price? Scraped daily across Amazon, eBay, Walmart, AliExpress.
- Review patterns — fake review networks share patterns visible only when you've scraped 100k+ reviews to compare against.
- Ad placements — verifying that an ad campaign rendered to real users in claimed geographies (see our ad-verification guide).
- Phishing infrastructure — scraping suspected phishing domains to fingerprint kits and detect new variants.
- Dark-web forums — pulling sales of stolen credentials to alert affected customers within hours of the leak.
This is why fraud and threat intelligence are two of the largest enterprise residential proxy buyers — see our why companies use residential proxies breakdown.
Reference Architecture (Sub-200 ms Total)
| Stage | What it does | Latency budget |
| --- | --- | --- |
| 1. Ingest | Receive transaction event over HTTPS API | 5 ms |
| 2. Enrich (parallel) | IP reputation, device lookup, geo, BIN, ASN — all in parallel | 50 ms |
| 3. Rules engine | Hard blocks: sanctions list, known-bad IPs, blacklisted devices | 10 ms |
| 4. ML model | Gradient-boosted decision tree or shallow neural net scoring all features | 30 ms |
| 5. Decision + log | Return allow / review / block; log for offline retraining | 5 ms |
| Total | | ~100 ms |
The single hardest constraint is parallel enrichment: every external lookup must complete inside the 50 ms window. This drives architectural choices like Redis-cached IP reputation (no network call), in-memory device fingerprint stores, and async I/O for any unavoidable third-party API hits.
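The fan-out-with-deadline pattern can be sketched with `asyncio`: every lookup launches concurrently, each is wrapped in a 50 ms timeout, and anything that misses the deadline degrades to `None` instead of stalling the pipeline. The lookup stubs and event fields here are illustrative placeholders, not a real enrichment API.

```python
import asyncio

async def enrich(event: dict, timeout_s: float = 0.05) -> dict:
    """Fan out all enrichment lookups concurrently under one 50 ms budget.
    A slow or failed lookup yields None; the scorer handles missing features."""

    # Stubs standing in for real backends (Redis, fingerprint store, BIN DB).
    async def ip_reputation(ip):
        return {"listed": False}

    async def device_lookup(fp):
        return {"accounts_seen": 1}

    async def geo_lookup(ip):
        return {"country": "US"}

    async def bin_lookup(card_bin):
        return {"issuer_country": "US"}

    tasks = {
        "ip_rep": ip_reputation(event["ip"]),
        "device": device_lookup(event["fingerprint"]),
        "geo": geo_lookup(event["ip"]),
        "bin": bin_lookup(event["card_bin"]),
    }
    results = await asyncio.gather(
        *(asyncio.wait_for(t, timeout_s) for t in tasks.values()),
        return_exceptions=True,  # timeouts/errors become values, not crashes
    )
    return {name: (None if isinstance(r, Exception) else r)
            for name, r in zip(tasks, results)}

event = {"ip": "203.0.113.7", "fingerprint": "ab12cd34", "card_bin": "411111"}
features = asyncio.run(enrich(event))
```

The `return_exceptions=True` choice is the key design point: a single flaky third-party lookup must never take the whole transaction past the 200 ms budget.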
Compliance: GDPR + CCPA
Real-time fraud detection involves automated decisioning on personal data, which both GDPR and CCPA regulate:
- Lawful basis (GDPR Article 6). Most fraud systems process under "legitimate interest" (Article 6(1)(f)); Recital 47 explicitly cites fraud prevention as an example of a legitimate interest, so explicit consent is not required.
- Article 22 (automated decisioning). If a fraud decision has "legal or similarly significant effects" — like blocking a payment — the user has the right to human review. Production systems implement this as a manual review queue for ambiguous cases.
- Data minimization. Don't store more than required. IP reputation and device fingerprints have legitimate retention periods; full request bodies usually do not.
- Right to challenge. Affected users can request explanation of why they were flagged. This drives the rules-first architecture — rules are explainable; ML feature importance is murkier.
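The Article 22 and right-to-challenge requirements above translate directly into decision routing: ambiguous scores go to a human review queue, and every outcome carries a plain-language reason that can be surfaced on challenge. A minimal sketch (the 0.4/0.8 thresholds are assumed illustrative values; real systems calibrate them against review capacity and fraud cost):

```python
REVIEW_BAND = (0.4, 0.8)  # assumed thresholds, tuned per deployment

def decide(risk_score: float, hard_block: bool):
    """Return (action, reason). Mid-band scores route to human review
    (GDPR Art. 22); the reason string is logged for right-to-challenge."""
    if hard_block:
        return ("block", "matched hard rule (sanctions/known-bad list)")
    lo, hi = REVIEW_BAND
    if risk_score >= hi:
        return ("block", f"risk {risk_score:.2f} >= {hi}")
    if risk_score >= lo:
        return ("review", f"risk {risk_score:.2f} in review band {lo}-{hi}")
    return ("allow", f"risk {risk_score:.2f} < {lo}")
```

Hard rules short-circuit the score entirely, which is also why they sit before the ML stage in the reference architecture: a rule-based block is trivially explainable.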
For US-only systems, state privacy laws like CCPA and the FTC's broader fraud-enforcement authority under the FTC Act apply, but with fewer process requirements than GDPR.