Indexed Though Blocked by robots.txt: What It Means + How to Fix It
“Indexed, though blocked by robots.txt” in Google Search Console means Google indexed your page without being able to crawl it. Here's why it happens — and how to fix it.
“Indexed, though blocked by robots.txt” means Google has the URL in its index but your robots.txt won't let it crawl the content — so it ranks with no real snippet. robots.txt blocks crawling, not indexing. If the page should be indexed, remove the Disallow. If it shouldn't, robots.txt is the wrong tool: allow crawling and add a noindex tag instead.
The contradiction in this status dissolves once you accept one fact: robots.txt controls crawling, not indexing. Blocking a URL in robots.txt tells Google don't fetch this — it does not say don't list this. So Google can, and sometimes does, put the URL in the index on outside signals alone (most often inbound links from other sites), having never read a word of the page.
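The crawl-only scope of robots.txt can be sketched with Python's standard-library parser. This is a minimal illustration, not Google's own matcher: the robots.txt content and the URLs are hypothetical, and `crawl_allowed` is a helper name introduced here.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /thank-you/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawl_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the parsed robots.txt lets user_agent fetch url."""
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs on a placeholder domain.
blocked = crawl_allowed("https://yourdomain.com/thank-you/order-123")
open_page = crawl_allowed("https://yourdomain.com/products/widget")
print(blocked, open_page)
```

The check answers only "may this crawler fetch the URL?". Nothing in robots.txt expresses "keep this URL out of the index", which is why a blocked URL can still be listed.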
The result is the worst of both worlds. The URL shows up in search, but with no title, a generic or stale description, or the giveaway snippet “No information is available for this page.” Google is ranking a page it can't see.
So the real question isn't how do I make this go away — it's did I want this page indexed or not? The two answers need opposite fixes, and picking the wrong one is how most people get stuck here.
First, confirm what Google can and can't do
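Open the Pages report, click “Indexed, though blocked by robots.txt” (because the page is indexed, this warning usually sits in the Improve page appearance table rather than under Why pages aren't indexed), and pull one URL into URL Inspection. Two fields tell you everything: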
- Crawl allowed: No — confirms a robots.txt rule is blocking the fetch.
- Page fetch: Failed (blocked) alongside Indexing: URL is on Google — confirms the page is indexed despite that block.
If Crawl allowed says Yes, you're not looking at this status — recheck the report. While you're in URL Inspection, note the robots.txt rule GSC names as the blocker; you'll need to find that exact Disallow: line.
This status usually means Google discovered the URL through a link, most often from another site. It found the URL, couldn't crawl it, and indexed the bare URL anyway. That's why tightening robots.txt won't make it disappear: the block is exactly what's causing the problem.
Decide which page this is
Everything downstream depends on this one call. Check the affected URL against both descriptions below; only one will fit.
This page should be indexed
You blocked it by accident — a leftover Disallow: from a migration, an overly broad rule like Disallow: /products/ that swept up pages you want, or a staging rule that shipped to production.
Tell: it's a real page you'd be happy to see rank — a product, an article, a category — and you can't remember deliberately wanting it out of Google.
This page should not be indexed
It's a thank-you page, a faceted-filter URL, an internal search result, a PDF you'd rather hide — and you reached for robots.txt to keep it out of Google. That's the wrong tool: robots.txt stops the crawl but not the listing, which is precisely why it's stuck here.
Tell: you intended to keep this URL out of search, and robots.txt was your method.
Fix it for the page you have
Match the path to the call you just made — you only need one.
- Should be indexed: remove the Disallow, then re-crawl
Open https://yourdomain.com/robots.txt and find the rule URL Inspection named, for example Disallow: /products/. Delete it, or narrow it so it no longer matches the URL you want crawled. Precedence is about specificity, not order: Google applies the longest matching path and, on a tie, the less restrictive rule, so a broad Disallow: still blocks every URL that no longer Allow: covers. After editing, check GSC's robots.txt report to confirm Google has fetched the new file, then test the exact URL with URL Inspection's live test.
Then re-run URL Inspection, confirm Crawl allowed: Yes, and click Request indexing. Google re-crawls, reads the real content, and rebuilds the index entry with a proper title and snippet.
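To preview which rule wins for a given path before publishing an edit, the longest-match precedence Google documents can be sketched as below. This is a simplified model under stated assumptions: it handles a single user-agent group, supports the `*` wildcard and a trailing `$`, and ignores percent-encoding. `is_allowed` and the sample rules are hypothetical names for illustration.

```python
import re


def _pattern_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, trailing '$') to a regex."""
    anchored = path_pattern.endswith("$")
    core = path_pattern[:-1] if anchored else path_pattern
    body = ".*".join(re.escape(part) for part in core.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(rules: list[tuple[str, str]], path: str) -> bool:
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, best_allow = -1, True
    for directive, pattern in rules:
        if not pattern:  # an empty Allow/Disallow value matches nothing
            continue
        if _pattern_to_regex(pattern).match(path):
            allow = directive.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


# Hypothetical rules: a broad block with a narrower exception.
rules = [("Disallow", "/products/"), ("Allow", "/products/featured/")]
print(is_allowed(rules, "/products/featured/widget"))  # True: the longer Allow wins
print(is_allowed(rules, "/products/old-widget"))       # False: only the Disallow matches
```

Reversing the order of the two rules gives the same answers, which is the point: narrowing a rule is about path specificity, not about where the line sits in the file.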
- Should NOT be indexed: allow crawling, then noindex
This is the counterintuitive part. To get a page out of the index, you have to let Google in. The deindex signal is a noindex tag — and Google can only see it by crawling the page.
First, remove the Disallow: rule from robots.txt so the page is crawlable. Then add a noindex directive to the page itself:
<meta name="robots" content="noindex" />
For non-HTML files such as PDFs, which cannot carry a meta tag, send the X-Robots-Tag: noindex HTTP response header instead.
Now Google can fetch the page, read the noindex, and drop it from the index, typically within days to a few weeks. Once it's gone, you can re-block in robots.txt, but you rarely need to; the noindex does the job on its own. See Excluded by noindex tag for what the clean end-state looks like.
- Then pull it out of your sitemap
For the deindex path: if the URL is in your XML sitemap, remove it. A sitemap entry tells Google this URL matters — the opposite of the noindex signal you're sending. Leaving it in keeps re-suggesting the page Google is trying to forget.
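If the sitemap is a static XML file, the removal can be scripted. A minimal sketch, assuming a standard sitemaps.org `urlset` document; `remove_urls` is a hypothetical helper, and a sitemap generated by a CMS or plugin should be changed at its source instead.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def remove_urls(sitemap_xml: str, urls_to_drop: set[str]) -> str:
    """Return the sitemap XML with every <url> whose <loc> is in urls_to_drop removed."""
    ET.register_namespace("", SITEMAP_NS)  # keep the default namespace unprefixed on output
    root = ET.fromstring(sitemap_xml)
    for url_el in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url_el.find(f"{{{SITEMAP_NS}}}loc")
        if loc is not None and (loc.text or "").strip() in urls_to_drop:
            root.remove(url_el)
    return ET.tostring(root, encoding="unicode")
```

Passing the sitemap text and the set of URLs you are deindexing returns the cleaned XML; entries that are not in the set are left untouched.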
Adding a noindex tag while the page is still blocked in robots.txt does nothing. Google can't crawl the page, so it never sees the noindex — the URL stays indexed indefinitely. noindex and a robots.txt block are mutually defeating: pick one, and for deindexing it must be noindex with crawling allowed.
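The rule above reduces to a simple conjunction: the noindex only counts if the crawl is allowed. A rough sketch, assuming you already have the page's HTML and response headers in hand; `has_noindex` and `deindex_will_work` are hypothetical helpers, and the parsing is deliberately simplified (it does not handle the `none` directive or user-agent-prefixed X-Robots-Tag values).

```python
from html.parser import HTMLParser


class _RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots"> and <meta name="googlebot"> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {k.lower(): (v or "") for k, v in attrs}
        if attr.get("name", "").lower() in {"robots", "googlebot"}:
            self.directives += [d.strip().lower() for d in attr.get("content", "").split(",")]


def has_noindex(html: str, headers: dict[str, str]) -> bool:
    """True if a robots meta tag or the X-Robots-Tag header carries noindex."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    header_value = next((v for k, v in headers.items() if k.lower() == "x-robots-tag"), "")
    header_directives = [d.strip().lower() for d in header_value.split(",")]
    return "noindex" in parser.directives or "noindex" in header_directives


def deindex_will_work(crawl_allowed: bool, html: str, headers: dict[str, str]) -> bool:
    """A noindex is only effective when the crawler is allowed to fetch the page."""
    return crawl_allowed and has_noindex(html, headers)
```

With a page that carries the meta tag, `deindex_will_work(False, page_html, {})` is `False` and `deindex_will_work(True, page_html, {})` is `True`, mirroring the two states described above.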
Confirm the fix landed
Give it a few days to a few weeks, then verify against what you were going for:
- If you wanted it indexed: re-inspect in URL Inspection for Crawl allowed: Yes, then watch the Performance report for that page; impressions and a real snippet appearing are the proof it's indexed and readable.
- If you wanted it deindexed: re-inspect and confirm Google processed the noindex; the URL should move to Excluded by ‘noindex’ tag in the Pages report and fall out of site:yourdomain.com/your-page-url results.
If a page you wanted gone is still listed after a couple of weeks, the usual cause is that the block went back on too soon — Google never got a clean crawl with the noindex visible. Lift the block, leave it lifted until the page clears, and re-check.
Don't confuse it with these neighbors
| Status | What it really means | Fix lives at |
|---|---|---|
| Indexed, though blocked by robots.txt | Google indexed the URL from outside signals but can't crawl the content | This page |
| Blocked by robots.txt | Crawling is blocked and the URL usually isn't indexed at all | Blocked by robots.txt guide |
| Excluded by ‘noindex’ tag | You deliberately deindexed it with a noindex Google was able to read | Excluded by noindex guide |
| Duplicate without user-selected canonical | Google picked a different version of the page to index | Duplicate canonical guide |
The line that sets this apart: it's an indexed page, and the block is the cause of the problem — not the solution to it.
The hard part isn't editing robots.txt — it's spotting these on a large site, where a single broad Disallow: can strand dozens of URLs in this bucket, mixing pages you want indexed with ones you don't.
TurboConsole reads your Search Console data, flags every page indexed-but-blocked, names the exact robots.txt rule behind it, and tells you per page whether to unblock or deindex — so you fix the right ones the right way instead of inspecting URLs one at a time.
Frequently asked
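How can a page be both indexed and blocked from crawling?
robots.txt controls crawling, not indexing. Google can index a URL from outside signals, most often links, without ever fetching the page.
Will using both a robots.txt block and a noindex tag deindex the page?
No. While the block is in place Google can't crawl the page, so it never sees the noindex. Allow crawling and keep the noindex.
How long until Google removes a page from the index after I add noindex?
Typically days to a few weeks, counted from when the page becomes crawlable with the noindex visible.
How is this different from “Blocked by robots.txt”?
With “Blocked by robots.txt”, crawling is blocked and the URL usually isn't indexed at all. Here the URL is indexed despite the block.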
Related issues
- Blocked by robots.txt: What It Means + How to Fix It. “Blocked by robots.txt” in Google Search Console means your robots.txt is telling Google not to crawl this page. Here's when it's intentional, when it's a problem, and exactly how to fix it.
- Excluded by 'noindex' Tag: What It Means + How to Fix It. “Excluded by 'noindex' tag” in Google Search Console means a noindex directive is keeping your page out of the index. Here's when that's correct, when it's a mistake, and exactly how to fix it.
- Duplicate Without User-Selected Canonical: What It Means + How to Fix It. “Duplicate without user-selected canonical” in Google Search Console means Google found duplicate pages and picked one for you. Here's why, and exactly how to fix it.