Data study · June 15, 2026
We probed 9,992,781 of the world’s most popular domains and labelled each one alive, redirect, blocked, or dead. The real dead figure is 14.1%, not the 27.6% a naive crawl reports: nearly half of that “dead” is not dead at all, mostly live sites blocking the crawler.
14.1% of the 9,992,781 probed domains are genuinely dead: no DNS, no connection, nothing answers. That is the real dead-web figure, not the 27.6% a naive crawl reports.
8.9% answer but block automated clients (403/429/challenge) from a datacenter IP. They are alive, just not to a bot. Naive scans count these as dead.
10.3% of all domains no longer resolve in DNS. This is the dominant cause of true death: 1,027,492 domains gone dark.
33% of .cn domains are dead, making it the deadest common TLD. Institutional and cheap-registration TLDs rot fastest, well above the .com baseline.
1.1% of responders are parked.
All figures are homepage-level reachability from a datacenter IP, a lower bound.
Every probed domain, by outcome
A naive 2024 crawl of the same top-10M list reported 27.6% dead. Probe honestly — separating genuine death from anti-bot blocking and answered errors — and the real figure is 14.1%. Here is where the difference goes.
Naive crawl (27.6% dead): DNS failure, anti-bot 403s, 404/5xx and timeouts all lumped together.
Honest probe (14.1% dead): no DNS, connection refused, or nothing accepts a connection.
Where the “dead” really goes
The same domains, probed by an honest bot and by a browser-like client (real Chrome TLS/JA3). Where the browser column is lower on dead/blocked, the site is reachable — the bot just wasn't let in.
| Probe arm | Probed | Alive | Blocked | Dead | Dead % |
|---|---|---|---|---|---|
| Polite bot | 9,992,781 | 7,657,422 | 891,517 | 1,412,544 | 14.1% |
| Reachability (browser) | 9,997,315 | 7,743,245 | 819,599 | 1,412,889 | 14.1% |

| Country (TLD) | Dead % |
|---|---|
| China (.cn) | 33% |
| India (.in) | 25.8% |
| United States of America (.us) | 22% |
| Brazil (.br) | 20.9% |
| Spain (.es) | 16.6% |
| Japan (.jp) | 15.6% |
| United Kingdom (.uk) | 15.3% |
| Australia (.au) | 15% |
| Russia (.ru) | 14.8% |
| France (.fr) | 14.5% |
| Canada (.ca) | 14.1% |
| Italy (.it) | 13.5% |
| Poland (.pl) | 13.1% |
| Sweden (.se) | 11.6% |
| Switzerland (.ch) | 9.8% |
| Netherlands (.nl) | 9.7% |
| Austria (.at) | 8.6% |
| Germany (.de) | 7.6% |
| Czechia (.cz) | 7.2% |
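
The percentages in the probe-arm table can be re-derived from its counts. The sketch below is only a consistency check on the published numbers, not the study's own code. The `unlisted` field is an assumption: it is the remainder after alive, blocked, and dead, which is presumably the redirect outcome the table does not show.

```python
# Counts copied from the probe-arm table above.
ARMS = {
    "Polite bot": {
        "probed": 9_992_781, "alive": 7_657_422,
        "blocked": 891_517, "dead": 1_412_544,
    },
    "Reachability (browser)": {
        "probed": 9_997_315, "alive": 7_743_245,
        "blocked": 819_599, "dead": 1_412_889,
    },
}


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to one decimal."""
    return round(100 * count / total, 1)


def summarize(arm: dict) -> dict:
    """Percentage per listed outcome, plus the count the table leaves unlisted."""
    listed = ("alive", "blocked", "dead")
    summary = {k: share(arm[k], arm["probed"]) for k in listed}
    # Assumption: the remainder is the "redirect" outcome, which has no column.
    summary["unlisted"] = arm["probed"] - sum(arm[k] for k in listed)
    return summary


for name, arm in ARMS.items():
    print(name, summarize(arm))
```

Both arms come out at 14.1% dead, while the blocked share drops from 8.9% for the polite bot to 8.2% for the browser-like client.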
The gap between 27.6% and 14.1% is mostly a measurement choice. A crawler that stops at the first response sees only 45.9% return a clean 200; follow the redirects and read the bodies, and 71.9% are alive. Here is where every first response ends up.

| First response → final outcome | Domains (share) |
|---|---|
| 200 OK → Alive | 4,584,611 (46.3%) |
| 3xx redirect → Alive | 2,677,304 (27%) |
| No response → Dead | 1,413,013 (14.3%) |
| 403 / 429 → Blocked | 410,511 (4.1%) |
| 3xx redirect → Blocked | 365,368 (3.7%) |
| 404 → Alive | 236,685 (2.4%) |
| No response → Blocked | 105,222 (1.1%) |
| 5xx → Alive | 85,728 (0.9%) |
| 3xx redirect → Redirect | 31,267 (0.3%) |
| 3xx redirect → Dead | 1,775 (<0.1%) |
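
To see how the first-response flows roll up into final outcomes, the rows of the table above can be summed by outcome. This is a sketch over the published counts only; the tuple layout and function name are illustrative. One observation it surfaces: the percentages printed in the table match shares of the listed flows (9,911,484 domains in total), which suggests, without confirming, that this is the denominator used there.

```python
from collections import defaultdict

# (first response, final outcome, domains) copied from the table above.
FLOWS = [
    ("200 OK", "Alive", 4_584_611),
    ("3xx redirect", "Alive", 2_677_304),
    ("No response", "Dead", 1_413_013),
    ("403 / 429", "Blocked", 410_511),
    ("3xx redirect", "Blocked", 365_368),
    ("404", "Alive", 236_685),
    ("No response", "Blocked", 105_222),
    ("5xx", "Alive", 85_728),
    ("3xx redirect", "Redirect", 31_267),
    ("3xx redirect", "Dead", 1_775),
]


def totals_by(column: int) -> dict:
    """Sum domain counts grouped by column 0 (first response) or 1 (outcome)."""
    grouped = defaultdict(int)
    for row in FLOWS:
        grouped[row[column]] += row[2]
    return dict(grouped)


total = sum(count for _, _, count in FLOWS)
by_outcome = totals_by(1)
by_first_response = totals_by(0)

print(total)
print(by_outcome)
print(round(100 * 4_584_611 / total, 1))  # share of "200 OK → Alive"
```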
Split the 10 million by popularity and the dead rate climbs more than 20× — from 0.8% in the top 1,000 to 16.1% past rank 5 million — while blocked runs the other way, peaking at the popular head.
99.8% of dead domains sit outside the top 100,000. The popular top-100K, where most web traffic lives, is only 2.2% dead, so weighted by attention the dead web nearly disappears: 14.1% of the top 10M are dead, against only 2.2% of the top-100K.
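
The 99.8% figure follows from the other two numbers, and the arithmetic is short enough to check. The sketch assumes the polite-bot dead count (1,412,544) as the total and treats the rounded 2.2% as exact, so the result is approximate. The function name is illustrative.

```python
def share_of_dead_outside_top(top_n: int, top_dead_rate: float, total_dead: int) -> float:
    """Percentage of all dead domains that rank outside the top `top_n`."""
    dead_in_top = top_n * top_dead_rate  # about 2,200 dead domains in the top-100K
    return round(100 * (1 - dead_in_top / total_dead), 1)


# 2.2% of 100,000 is roughly 2,200 domains, a tiny slice of 1,412,544 dead.
print(share_of_dead_outside_top(100_000, 0.022, 1_412_544))  # prints 99.8
```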
The full dataset is on GitHub; the 50 highest-ranked domains are listed here.
| Rank | Domain | Outcome | Reason | Status | Final URL |
|---|---|---|---|---|---|
| 1 | www.facebook.com | Alive | ok | 200 | https://www.facebook.com |
| 2 | fonts.googleapis.com | Alive | not_found | 404 | https://fonts.googleapis.com |
| 3 | www.google.com | Alive | ok | 200 | https://www.google.com |
| 4 | www.googletagmanager.com | Alive | not_found | 404 | https://www.googletagmanager.com |
| 5 | www.youtube.com | Alive | ok | 200 | https://www.youtube.com |
| 6 | www.instagram.com | Alive | ok | 200 | https://www.instagram.com |
| 7 | twitter.com | Alive | ok | 200 | https://twitter.com |
| 8 | www.linkedin.com | Alive | ok | 200 | https://www.linkedin.com |
| 9 | fonts.gstatic.com | Alive | not_found | 404 | https://fonts.gstatic.com |
| 10 | gmpg.org | Alive | ok | 200 | http://gmpg.org |
| 11 | ajax.googleapis.com | Alive | ok | 200 | https://developers.google.com/speed/libraries |
| 12 | play.google.com | Alive | ok | 200 | https://play.google.com/store/games |
| 13 | cdnjs.cloudflare.com | Alive | ok | 200 | https://cdnjs.cloudflare.com |
| 14 | x.com | Alive | ok | 200 | https://x.com |
| 15 | maps.google.com | Alive | ok | 200 | https://www.google.com:443/maps |
| 16 | youtu.be | Alive | ok | 200 | https://www.youtube.com/?feature=youtu.be |
| 17 | docs.google.com | Alive | ok | 200 | https://accounts.google.com/v3/signin/identifier?continue=https%3A%2F%2Fdocs.google.com%2F&dsh=S1998365293%3A1781491322151626&emr=1&followup=https%3A%2F%2Fdocs.google.com%2F&osid=1&passive=1209600&flowName=WebLiteSignIn&flowEntry=ServiceLogin&ifkv=AcDsRvxjY2DiTO1LO55T2ftQhVddFjZd9mbP0ifH22dlQQz5lXpKMCeXRAxnhUuyXmdS-JxRXpbP |
| 18 | support.google.com | Alive | ok | 200 | https://support.google.com |
| 19 | github.com | Alive | ok | 200 | https://github.com |
| 20 | instagram.com | Alive | ok | 200 | https://www.instagram.com/ |
| 21 | cdn.jsdelivr.net | Alive | ok | 200 | https://www.jsdelivr.com |
| 22 | developers.google.com | Alive | ok | 200 | https://developers.google.com |
| 23 | drive.google.com | Alive | ok | 200 | https://accounts.google.com/v3/signin/identifier?continue=https%3A%2F%2Fdrive.google.com%2F&dsh=S-1676406091%3A1781515362389664&emr=1&followup=https%3A%2F%2Fdrive.google.com%2F&osid=1&passive=1209600&service=wise&flowName=WebLiteSignIn&flowEntry=ServiceLogin&ifkv=AcDsRvzieavHcDznZtV_nadEVIOW7hwFyhsFyTZGEdZXTzZLeyX1AGPew4tKJ2oR5OGnSRigGecLkg |
| 24 | policies.google.com | Alive | ok | 200 | https://policies.google.com |
| 25 | www.tiktok.com | Alive | ok | 200 | https://www.tiktok.com |
| 26 | wordpress.org | Alive | ok | 200 | https://wordpress.org |
| 27 | en.wikipedia.org | Alive | ok | 200 | https://en.wikipedia.org/wiki/Main_Page |
| 28 | facebook.com | Alive | ok | 200 | https://www.facebook.com/ |
| 29 | bit.ly | Alive | ok | 200 | https://bitly.com/ |
| 30 | creativecommons.org | Alive | ok | 200 | https://creativecommons.org |
| 31 | apps.apple.com | Alive | ok | 200 | https://apps.apple.com/us/iphone/today |
| 32 | www.twitter.com | Alive | ok | 200 | https://twitter.com/ |
| 33 | itunes.apple.com | Alive | ok | 200 | https://www.apple.com/itunes/ |
| 34 | youtube.com | Alive | ok | 200 | https://www.youtube.com/ |
| 35 | www.amazon.com | Alive | ok | 202 | https://www.amazon.com |
| 36 | www.microsoft.com | Alive | ok | 200 | https://www.microsoft.com/en-us |
| 37 | sites.google.com | Alive | ok | 200 | https://accounts.google.com/v3/signin/identifier?continue=https%3A%2F%2Fsites.google.com%2F&dsh=S504210863%3A1781499596710744&followup=https%3A%2F%2Fsites.google.com%2F&osid=1&passive=1209600&service=wise&flowName=WebLiteSignIn&flowEntry=ServiceLogin&ifkv=AcDsRvxlEyRn51akwX3Yo5Z5-yjZ_0TmmrzjGTtdcNZu6h1fCR4c3cySc0HNA1BENslVEA2rqnHe8A |
| 38 | www.pinterest.com | Alive | ok | 200 | https://www.pinterest.com |
| 39 | secure.gravatar.com | Alive | ok | 200 | https://secure.gravatar.com |
| 40 | static.cloudflareinsights.com | Alive | server_error | 522 | https://static.cloudflareinsights.com |
| 41 | accounts.google.com | Alive | ok | 200 | https://accounts.google.com/v3/signin/identifier?continue=https%3A%2F%2Faccounts.google.com%2F&dsh=S504210863%3A1781499430162528&followup=https%3A%2F%2Faccounts.google.com%2F&passive=1209600&flowName=WebLiteSignIn&flowEntry=ServiceLogin&ifkv=AcDsRvwE44MttILTyncrMyfB8W-pDyONGESU6e8EDDVCxCmxurjFPgK24vj_zcckf8eY7ScogzT5 |
| 42 | medium.com | Blocked | forbidden | 403 | https://medium.com |
| 43 | vimeo.com | Alive | ok | 200 | https://vimeo.com |
| 44 | open.spotify.com | Alive | ok | 200 | https://open.spotify.com |
| 45 | goo.gl | Alive | client_error | 400 | https://goo.gl |
| 46 | plus.google.com | Alive | ok | 200 | https://workspaceupdates.googleblog.com/2023/04/new-community-features-for-google-chat-and-an-update-currents%20.html |
| 47 | lh3.googleusercontent.com | Alive | client_error | 400 | https://lh3.googleusercontent.com |
| 48 | www.gstatic.com | Alive | not_found | 404 | https://www.gstatic.com |
| 49 | soundcloud.com | Alive | ok | 200 | https://soundcloud.com |
| 50 | t.me | Alive | ok | 200 | https://telegram.org/ |
We probe a top-popularity domain list HTTPS-first from a datacenter IP, following redirects, and label each domain alive, redirect, blocked, or dead by the evidence the probe captures — a final HTTP status, or a transport error plus whether a raw TCP connect still succeeds. A served 404, a 5xx, or a Cloudflare 52x is alive (the host answered); a 403/429 or anti-bot challenge is blocked; only no DNS, a refused/reset connection, or nothing accepting a connection is dead. Every domain is probed twice — as a polite bot and as a browser-like client (real Chrome TLS/JA3) — and the full per-domain dataset is open.
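
The labelling rules above can be written as a small decision function. This is a minimal sketch, not the study's code: the `ProbeResult` record and its field names are hypothetical, and two branches are assumptions flagged in the comments (what counts as the `redirect` outcome, and treating a host that accepts TCP but never answers HTTP as blocked, which is inferred from the "No response → Blocked" flow).

```python
from dataclasses import dataclass
from typing import Optional

BLOCK_STATUSES = {403, 429}


@dataclass
class ProbeResult:
    """Evidence captured for one domain (hypothetical record shape)."""
    final_status: Optional[int] = None  # final HTTP status after redirects, if any
    dns_resolved: bool = True           # False when the name has no DNS answer
    tcp_connect_ok: bool = False        # did a raw TCP connect succeed?
    challenge_detected: bool = False    # an anti-bot challenge page was served


def classify(result: ProbeResult) -> str:
    """Label a probe result as alive, redirect, blocked, or dead."""
    status = result.final_status
    if status is not None:
        if status in BLOCK_STATUSES or result.challenge_detected:
            return "blocked"
        if 300 <= status < 400:
            # Assumption: a chain that still ends on a 3xx is the "redirect" outcome.
            return "redirect"
        # Any other served status (200, 404, 5xx, Cloudflare 52x): the host answered.
        return "alive"
    if not result.dns_resolved:
        return "dead"  # no DNS
    if result.tcp_connect_ok:
        # Assumption: the host accepts TCP but gave no HTTP answer, so it is
        # treated as blocking the client rather than as gone.
        return "blocked"
    return "dead"  # refused, reset, or nothing accepts a connection


print(classify(ProbeResult(final_status=522, tcp_connect_ok=True)))  # prints alive
print(classify(ProbeResult(dns_resolved=False)))                     # prints dead
```

Checking the HTTP status first keeps the rule "the host answered, so it is alive" ahead of any transport-level evidence.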
This measures whether the domain itself still resolves and answers. That is a different question from Pew Research’s 2024 link-rot study (25% of pages from 2013–2023 are gone; 38% of 2013 pages) and Ahrefs’ link-rot study (66.5% of links have rotted), which measure broken links inside living pages. It is also not the “dead internet theory”, which is a claim about AI-generated content, not domain reachability.
Cite this
Crawlora (2026). Dead-Web Index 2026. 14.1% of 9,992,781 top domains are genuinely dead; 8.9% answer but block automated clients. https://crawlora.net/dead-web-index.
14.1% of the top 9,992,781 domains are genuinely dead: 1,412,544 sites that no longer resolve in DNS or refuse every connection. That is far below the often-quoted "27.6% of the web is dead," which counted anti-bot blocks and answered errors as death.
A dead site never answers — no DNS record, or nothing accepts a TCP connection. A blocked site is alive and answering, it just refuses an automated client (a 403, 429, or anti-bot challenge). 8.9% of the top web (891,517 sites) is blocked, not dead — a distinction naive crawlers miss.
This index does not measure the dead internet theory, which is the claim that AI-generated content and bots have replaced human activity on the living web. It measures something different and concrete: how many domains have gone completely dark and unreachable, with DNS gone, the connection refused, or the server gone.
Earlier top-10M crawls counted three non-dead things as dead: anti-bot 403/429 blocks, 404/5xx pages served by a live server, and domains a single flaky DNS resolver failed to look up. Classifying honestly — dead means genuinely unreachable — brings the real figure to 14.1%.
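
The third miscount, a lookup failure from a single flaky resolver, can be guarded against by asking more than one resolver before declaring a name gone. The source does not say how the index confirms DNS failures, so the sketch below is one possible approach, not the method used. The `Resolver` type and `dns_confirmed_dead` are hypothetical names; `system_resolver` uses only the standard library's `socket.getaddrinfo`.

```python
import socket
from typing import Callable, Iterable

# A resolver takes a domain and reports whether the name resolved.
Resolver = Callable[[str], bool]


def system_resolver(domain: str) -> bool:
    """Resolve through the operating system's configured resolver."""
    try:
        socket.getaddrinfo(domain, 443)
        return True
    except socket.gaierror:
        return False


def dns_confirmed_dead(domain: str, resolvers: Iterable[Resolver]) -> bool:
    """True only when at least one resolver was tried and every one failed."""
    results = [resolve(domain) for resolve in resolvers]
    return bool(results) and not any(results)


# Stand-in resolvers for illustration: one flaky, one healthy.
flaky = lambda domain: False
healthy = lambda domain: True

print(dns_confirmed_dead("example.com", [flaky, healthy]))  # prints False
print(dns_confirmed_dead("example.com", [flaky, flaky]))    # prints True
```

In practice the list would hold several independent resolvers, so that one bad answer cannot mark a live domain as dead.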
.cn has the highest death rate among common TLDs at 33%. Institutional TLDs like .gov and .edu also rank high — matching Pew Research's finding that government and reference pages suffer the worst link rot.
Anti-bot systems (Cloudflare, DataDome, and others) serve a 403 or a challenge to a datacenter IP while letting a real browser through. A matched browser TLS/JA3 fingerprint reaches the site where a naive bot is blocked — which is exactly why this index probes every domain twice, as a polite bot and as a browser-like client.
8.9% of the top web answers but blocks a naive bot. Crawlora escalates from a plain request to a real browser fingerprint only as far as a site demands, and bills on success, so you can reach the live web outside the 14.1% that is genuinely dead.
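
The idea of escalating only as far as a site demands can be sketched as a loop over probe arms that stops at the first arm not blocked. This is an illustration of the general pattern, not Crawlora's implementation. The arm functions are hypothetical stand-ins that return an outcome label and do no network I/O.

```python
from typing import Callable, Sequence, Tuple

# An arm is a name plus a probe function returning an outcome label.
Arm = Tuple[str, Callable[[str], str]]


def probe_with_escalation(domain: str, arms: Sequence[Arm]) -> Tuple[str, str]:
    """Try arms in order; return (arm name, outcome) for the first non-blocked result.

    If every arm is blocked, the last arm's result is returned.
    """
    if not arms:
        raise ValueError("at least one probe arm is required")
    name, outcome = "", "blocked"
    for name, probe in arms:
        outcome = probe(domain)
        if outcome != "blocked":
            break  # no need to escalate further
    return name, outcome


# Hypothetical arms: the plain request is refused, the browser-like one gets through.
def plain_request(domain: str) -> str:
    return "blocked"


def browser_like(domain: str) -> str:
    return "alive"


arms = [("polite bot", plain_request), ("browser", browser_like)]
print(probe_with_escalation("medium.com", arms))  # prints ('browser', 'alive')
```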