How to find and fix orphan pages
An orphan page is any page on your site that no internal link points to. Google can still find it through your sitemap or an external backlink, but inside your own site it floats with zero internal authority. The page exists, it just does not benefit from anything else you have built. On a 40-page niche site this might be one forgotten review. On a 2,000-page programmatic SaaS or affiliate site, orphans quietly pile up into the hundreds, and they are almost always the pages you most want to rank.
There is a quieter, more common problem too: the under-linked page. It is not technically orphaned (one or two links reach it), but it is starved compared to its importance. A product comparison that took you three days to write and earns affiliate commissions, sitting with a single link from a 2022 blog post, is leaking money the same way a true orphan does. This guide covers how to detect both, why crawlers disagree with each other about what counts as an orphan, and the exact workflow I use to wire these pages back into a site so they actually start moving.
- An orphan page has zero internal links pointing to it; an under-linked page has too few relative to its commercial value. Both bleed authority, but orphans are easier to miss.
- Crawlers and Google Search Console disagree by design: a crawler can only follow internal links, so anything it finds in your sitemap but never reaches via a link is, by definition, an orphan candidate.
- The fix is not just adding any link. Link from topically relevant, high-traffic pages using descriptive anchor text, and place the link in the body, not the footer.
- Run an orphan audit on a fixed cadence (monthly for active sites). New orphans appear every time you publish without wiring the page in or change a navigation template.
- Measure the result: track impressions and crawl frequency in GSC for the fixed pages. A previously uncrawled page getting indexed within two weeks is the signal you want.
What an orphan page actually costs you
PageRank, or whatever Google calls its internal flavor of it now, still flows through links. When a page receives no internal links, it inherits none of the authority your homepage and top posts have accumulated. That has three concrete consequences.
First, crawl and indexing problems. Googlebot prioritizes crawling based partly on internal link signals. A page that nothing links to gets crawled rarely, sometimes not at all if it is only in your sitemap. I have seen affiliate pages sit unindexed for months purely because they were orphaned, then get indexed within ten days of adding three internal links.
Second, ranking ceiling. Even when an orphan is indexed, it competes with one hand tied behind its back. It has the on-page content but none of the internal authority signals that tell Google the page matters to you. Two near-identical pages, one well-linked and one orphaned, will not rank the same.
Third, wasted external links. This one stings. If you earned or bought a backlink to your homepage and the page you actually want to rank is orphaned, that external authority has no internal path to flow toward your money page. You paid for juice that pools at the front door. This is exactly why internal linking and link acquisition are two halves of one system, a point I make in detail in the internal linking strategy that actually moves rankings.
Orphan vs under-linked: the working definition
Why crawlers disagree about what counts as an orphan
This trips people up constantly, so it is worth being precise. A site crawler (Screaming Frog, Sitebulb, Ahrefs Site Audit) starts at a URL and follows internal links outward, much as Googlebot does when it discovers pages through links. By definition, a crawler cannot discover an orphan page through crawling alone, because there is no link to follow. So how does a tool report orphans at all?
It cross-references. The crawler builds a list of every URL it reached by following links, then compares that against a second list of URLs it knows should exist: your XML sitemap, Google Analytics or Search Console data, and server log files. Any URL that appears in those external sources but was never reached by the crawl is flagged as an orphan. This is why your orphan count changes depending on which data sources you connect.
| Data source | What it reveals | Catch |
|---|---|---|
| XML sitemap | Pages you intend to be indexed but did not link to | Only as good as your sitemap; programmatic sitemaps often list everything |
| Google Search Console | Pages Google has impressions/clicks for but you do not link to | Best signal for real, traffic-earning orphans |
| Google Analytics | Pages that got visits but receive no internal links | Misses pages with zero traffic (often the worst orphans) |
| Server log files | Pages Googlebot actually crawled vs your link graph | Most accurate, hardest to set up |
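To make the cross-referencing concrete, here is a minimal Python sketch of the idea, not how any particular crawler implements it. The function names, the source labels (`sitemap`, `gsc`), and the example URLs are all hypothetical. It flags every URL that some external source knows about but the link-following crawl never reached, and records which sources flagged it.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize(url):
    """Lowercase scheme and host, drop the fragment and any trailing slash."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/") or "/"
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))


def orphan_candidates(crawled, known_sources):
    """Map each URL that a source knows but the crawl never reached to its sources."""
    reached = {normalize(u) for u in crawled}
    flagged = {}
    for source_name, urls in known_sources.items():
        for url in urls:
            key = normalize(url)
            if key in reached:
                continue
            sources = flagged.setdefault(key, [])
            if source_name not in sources:
                sources.append(source_name)
    return flagged


# Hypothetical data: URLs reached by following links, and URLs from two sources.
crawled = ["https://example.com/", "https://example.com/blog/desk-guide/"]
known_sources = {
    "sitemap": ["https://example.com/", "https://example.com/reviews/old-desk"],
    "gsc": ["https://example.com/reviews/old-desk", "https://example.com/blog/desk-guide"],
}

result = orphan_candidates(crawled, known_sources)
for url, sources in result.items():
    print(url, "flagged by", ", ".join(sources))
```

Connecting or disconnecting a source changes the keys in `known_sources`, which is why the orphan count shifts with the data sources you plug in.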
Do not trust a single tool's orphan count
How to detect orphan and under-linked pages
Here is the workflow I actually run. It takes about 30 minutes the first time on a mid-sized site.
Step 1: Crawl with your sitemap connected
In Screaming Frog, go to Configuration > Spider > Crawl and enable XML Sitemap crawling, then add your sitemap URL. Run the crawl. When it finishes, the Reports > Orphan Pages export gives you every URL in the sitemap that the crawl never reached via an internal link. Sitebulb and Ahrefs Site Audit have equivalent reports. If you are choosing between crawlers, I compared the practical differences in the best backlink and SEO tools roundup.
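If you want to sanity-check a tool's export by hand, the same comparison can be sketched in a few lines of Python. This is an illustration under assumptions, not what Screaming Frog does internally: the sitemap is a standard `urlset` file, `SAMPLE_SITEMAP` and `crawled_urls` are made-up data, and the crawled set would come from your own crawl export.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard XML sitemaps.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return the set of <loc> URLs listed in a urlset sitemap."""
    root = ET.fromstring(xml_text)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


# Hypothetical sitemap content and crawl result.
SAMPLE_SITEMAP = (
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">'
    "<url><loc>https://example.com/</loc></url>"
    "<url><loc>https://example.com/reviews/old-desk</loc></url>"
    "</urlset>"
)
crawled_urls = {"https://example.com/"}

# In the sitemap but never reached by following links: orphan candidates.
orphans = sitemap_urls(SAMPLE_SITEMAP) - crawled_urls
print(sorted(orphans))
```

A sitemap index file (one that lists other sitemaps) would need an extra loop over its child sitemaps, which this sketch leaves out.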
Step 2: Layer in Search Console data
Connect the crawler to GSC, or export your GSC Pages report manually and cross-reference. The pages that have impressions but no internal links are your highest-priority orphans, because Google already considers them relevant enough to show. They are one good internal link away from climbing. Many of these overlap with your striking-distance keywords, the queries sitting on page two that a small authority nudge can push onto page one.
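For the manual route, the cross-reference might look like the sketch below. The column names `page` and `impressions` are assumptions, so rename them to match the header row of your actual export. The CSV text and the orphan set are invented for illustration.

```python
import csv
import io


def prioritize_by_impressions(gsc_csv_text, orphan_urls,
                              page_col="page", impressions_col="impressions"):
    """Return (url, impressions) for orphans that appear in the GSC export, highest first."""
    reader = csv.DictReader(io.StringIO(gsc_csv_text))
    rows = []
    for row in reader:
        url = row[page_col].strip()
        if url in orphan_urls:
            rows.append((url, int(row[impressions_col])))
    return sorted(rows, key=lambda r: r[1], reverse=True)


# Hypothetical export and orphan list.
gsc_export = """page,impressions
https://example.com/reviews/old-desk,1200
https://example.com/blog/desk-guide,5400
https://example.com/compare/a-vs-b,310
"""
orphan_urls = {
    "https://example.com/reviews/old-desk",
    "https://example.com/compare/a-vs-b",
    "https://example.com/thanks",
}

priority = prioritize_by_impressions(gsc_export, orphan_urls)
for url, impressions in priority:
    print(impressions, url)
```

Orphans that never show up in the export (like the hypothetical `/thanks` page here) have no impressions, so they drop out of the priority list and can be triaged separately.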
Step 3: Find under-linked pages with inlink counts
True orphans are the obvious case. The bigger opportunity is usually under-linked pages. In Screaming Frog, the Inlinks column shows how many internal links point to each URL. Sort it ascending. Anything with one or two inlinks that is commercially important (a money page, a pillar article, a key category) is under-linked. I build a quick spreadsheet: URL, inlink count, and a manual importance score from 1 to 3. Pages scoring high on importance and low on inlinks go to the top of the fix list.
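The spreadsheet logic is simple enough to sketch in Python. The thresholds here are my assumptions for illustration (two inlinks or fewer, importance of 2 or higher), as are the `PageRow` and `fix_list` names and the sample rows. Tune both cut-offs to your own site.

```python
from dataclasses import dataclass


@dataclass
class PageRow:
    url: str
    inlinks: int      # internal links pointing at the page, from the crawl
    importance: int   # manual score: 1 (low) to 3 (high)


def fix_list(rows, max_inlinks=2, min_importance=2):
    """Keep important pages with few inlinks; most important and least linked first."""
    candidates = [
        r for r in rows
        if r.inlinks <= max_inlinks and r.importance >= min_importance
    ]
    return sorted(candidates, key=lambda r: (-r.importance, r.inlinks))


# Hypothetical crawl rows with manual importance scores.
rows = [
    PageRow("/best-x-for-beginners", inlinks=1, importance=3),
    PageRow("/about", inlinks=1, importance=1),
    PageRow("/pillar/standing-desks", inlinks=2, importance=3),
    PageRow("/blog/news", inlinks=14, importance=2),
    PageRow("/compare/a-vs-b", inlinks=0, importance=2),
]

ordered = [r.url for r in fix_list(rows)]
print(ordered)
```

Sorting on importance first and inlink count second keeps a money page with two inlinks ahead of a mid-value page with none, which matches how I prioritise by hand.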
How to fix them: wiring pages back into your authority
Detection is the easy part. Fixing well is where most people fall short, because they add a single link from a random low-traffic post and call it done. That barely moves the needle. A good fix follows three rules.
- Link from relevant, high-authority pages. A link from your most-trafficked, topically related article passes far more value than three links from thin posts. Identify the orphan's topic, then find your 2-3 strongest pages on that topic and add a contextual link from each.
- Use descriptive anchor text, in the body. The anchor should describe the destination naturally ("our review of the best standing desks" not "click here"). Place it in the running text where it makes editorial sense, not jammed into a footer. Keep your anchors varied and natural so you do not over-optimize, which is its own risk covered in our piece on anchor text ratios.
- Add the orphan to relevant hub or pillar pages. If the orphan belongs to a topic cluster, link it from the cluster's pillar page and from sibling pages in the cluster. This is how you build a tight internal mesh instead of a hub-and-spoke that leaves its outer pages dangling.
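Picking the linking pages for the first rule can be roughed out in code. The sketch below assumes you already have a page inventory with a topic label and a traffic figure per page. The `topic` and `monthly_visits` fields, the function name, and the sample data are hypothetical, and traffic is only a stand-in for "strongest".

```python
def link_source_candidates(pages, topic, exclude_url, limit=3):
    """Return up to `limit` URLs on the same topic, highest traffic first."""
    same_topic = [
        p for p in pages
        if p["topic"] == topic and p["url"] != exclude_url
    ]
    ranked = sorted(same_topic, key=lambda p: p["monthly_visits"], reverse=True)
    return [p["url"] for p in ranked[:limit]]


# Hypothetical inventory; the orphan is the beginners page.
pages = [
    {"url": "/best-desks-for-beginners", "topic": "standing desks", "monthly_visits": 40},
    {"url": "/pillar/standing-desks", "topic": "standing desks", "monthly_visits": 5200},
    {"url": "/compare/desk-a-vs-desk-b", "topic": "standing desks", "monthly_visits": 1800},
    {"url": "/how-to/set-desk-height", "topic": "standing desks", "monthly_visits": 3100},
    {"url": "/blog/office-plants", "topic": "decor", "monthly_visits": 9000},
    {"url": "/blog/desk-news-2022", "topic": "standing desks", "monthly_visits": 60},
]

sources = link_source_candidates(pages, "standing desks", "/best-desks-for-beginners")
print(sources)
```

The output is only a shortlist. Whether a link actually makes editorial sense in the body of each candidate still needs a human read.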
For a real example: an affiliate site I worked on had a high-converting "best X for beginners" page sitting with one inlink. We added body links from the category pillar, from two adjacent comparison posts, and from the most popular how-to guide on the same topic. Four contextual links total. Within three weeks the page moved from position 14 to position 6 for its primary query. No new content, no new backlinks. Just internal authority finally reaching a page that deserved it.
When you should NOT fix an orphan (delete or noindex instead)
Not every orphan deserves rescuing. Some pages are orphaned because they should be. Be honest about this, because adding internal links to genuinely low-value pages dilutes the authority you could send to pages that matter.
| Orphan type | Action |
|---|---|
| Thin, outdated, or duplicate content | Delete and 301 redirect to the best equivalent, or merge |
| Utility pages (thank-you, login, internal search results) | Leave orphaned and noindex; they should not rank |
| Expired products / seasonal pages | Redirect or noindex depending on whether they will return |
| Genuinely valuable page that was simply forgotten | Fix it: add contextual internal links per the workflow above |
Orphaned pages can be a thin-content liability
Make orphan audits a habit, not a one-off
Orphans are not a problem you solve once. They regenerate. Every time you publish a post and forget to wire it in, every time you change a navigation template or remove a category, new orphans appear. On an actively published site I run a crawl monthly. On a stable site, quarterly is enough.
The cleanest prevention is a publishing checklist: before any new page goes live, add at least three contextual internal links to it from relevant existing pages, and add it to its cluster's pillar. If you do that consistently, your orphan count stays near zero and your authority compounds instead of leaking. Pair that internal discipline with deliberate external authority and the whole system multiplies.
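That checklist can be turned into an automated pre-publish check. The sketch below assumes you can export internal links as `(source, target, placement)` tuples where placement is `"body"` or something else such as `"footer"`. That format, the function names, and the sample links are hypothetical. It also assumes the pillar link counts toward the three, which is my reading rather than a hard rule.

```python
from collections import defaultdict


def body_inlink_sources(links):
    """Map each target URL to the set of pages linking to it from body content."""
    sources = defaultdict(set)
    for source, target, placement in links:
        if placement == "body" and source != target:
            sources[target].add(source)
    return sources


def checklist_problems(url, links, pillar_url, min_body_links=3):
    """Return a list of unmet checklist items for `url`; empty means ready to publish."""
    sources = body_inlink_sources(links).get(url, set())
    problems = []
    if len(sources) < min_body_links:
        problems.append(f"only {len(sources)} body link(s), need {min_body_links}")
    if pillar_url not in sources:
        problems.append("not linked from pillar page")
    return problems


# Hypothetical link export.
links = [
    ("/pillar", "/new-review", "body"),
    ("/post-a", "/new-review", "body"),
    ("/post-b", "/new-review", "footer"),  # footer links are ignored
    ("/post-c", "/new-review", "body"),
    ("/post-a", "/other-page", "footer"),
]

ready = checklist_problems("/new-review", links, pillar_url="/pillar")
not_ready = checklist_problems("/other-page", links, pillar_url="/pillar")
print(ready, not_ready)
```

Counting distinct source pages rather than raw links means five links from the same post still count as one.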
Fixing orphans is one of the highest-ROI things in SEO precisely because it requires no new content and no new backlinks. You are unlocking value you already created and paid for. Find the forgotten pages, link them properly, prune the ones that deserve to stay buried, and measure the result in Search Console. The pages were always there. They just needed a way back in.
Frequently asked questions
Do orphan pages hurt my whole site or just that page?
Mostly that page, but there can be a wider effect. The orphaned page itself struggles to get crawled, indexed, and ranked. However, a large volume of thin, forgotten orphan pages can drag on your site's overall quality signals, since Google assesses sites holistically. Valuable orphans should be fixed; thin ones should be pruned or noindexed rather than linked.
How many internal links does a page actually need?
There is no magic number, and it scales with importance and site size. As a practical floor, aim for at least three contextual in-body links from topically relevant pages for any page you want to rank. Your most important money and pillar pages should have far more. What matters more than raw count is that the links come from relevant, reasonably authoritative pages and sit in the body content, not the footer.
Why does my sitemap include pages that crawlers call orphans?
Because the two systems work differently. Your XML sitemap is just a list of URLs you want indexed; it does not require those pages to be internally linked. A crawler discovers pages by following links. When a URL is in your sitemap but the crawler never reaches it via a link, it is flagged as an orphan. That gap is exactly the problem to fix: being in the sitemap is not a substitute for internal links.
Will fixing orphan pages help if they already get some traffic?
Often dramatically, yes. A page already earning impressions in Search Console has proven Google considers it relevant. Such pages are usually one or two good internal links away from a real ranking jump, because you are adding the internal authority signal that was missing. These traffic-earning orphans are the highest-priority fixes on most sites.
Can I just add all my orphans to the footer to fix them fast?
No, that is a false fix. Footer and sidebar links are generally believed to carry less weight with search engines and offer little contextual relevance, so they pass little authority. A page whose only links are in the footer is functionally still orphaned. Always add contextual links from within the body content of relevant pages instead.