You publish a page. The content is solid, the design is clean, and everything is live. A week passes. You search Google and find nothing. The page doesn’t exist as far as the index is concerned.
What Google Actually Has to Do Before Ranking Anything
Before a page can appear in search results, Google has to complete four distinct steps in sequence: discover the page, crawl it, understand its content, and index it. Miss any one of those steps - through a configuration error, a missing signal, or a rendering problem - and the page simply won’t show up, regardless of how well-written it is.
Google Search Console will usually tell you what went wrong. The messages that send developers searching for answers include “Discovered – currently not indexed,” “Crawled – currently not indexed,” “Excluded by ‘noindex’ tag,” and “Duplicate without user-selected canonical.” Each one points to a specific failure in that four-step sequence.
The failure is usually technical rather than a matter of content quality, and most of these problems have a direct fix.
Working through them systematically - starting with the most destructive mistakes and moving toward subtler issues - is the fastest path from invisible to indexed.
The Seven Places Indexing Breaks Down
robots.txt blocking your own site
This is the most disruptive mistake, and it happens more often than it should. A developer adds a blanket disallow rule during staging to keep the site out of search results while it’s being built. The rule looks like this:
User-agent: *
Disallow: /
Then the site ships to production with that rule intact. Google sees it, respects it, and crawls nothing. The fix is straightforward - update robots.txt to allow crawling and add a sitemap reference:
User-agent: *
Allow: /
Sitemap: https://example.com/sitemap.xml
After making the change, open the robots.txt report in Google Search Console (it replaced the older robots.txt Tester) to confirm Google has fetched the updated file, and check an affected URL with the URL Inspection tool before assuming the problem is resolved.
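To make the difference between the two files above concrete, here is a minimal sketch of a robots.txt check in JavaScript. It is a simplification and not Google's actual matcher: it only understands User-agent, Allow, and Disallow lines with plain path prefixes (no * or $ patterns), and the name isPathBlocked is made up for this example.

```javascript
// Simplified robots.txt check (assumption: prefix rules only, no wildcards).
// Returns true when the given path is disallowed for the given user agent.
function isPathBlocked(robotsTxt, path, userAgent = "*") {
  const rules = [];        // rules from groups that name this user agent
  let groupAgents = [];    // user agents of the group currently being read
  let lastWasAgent = false;

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.split("#")[0].trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();

    if (field === "user-agent") {
      // Consecutive User-agent lines belong to the same group.
      if (!lastWasAgent) groupAgents = [];
      groupAgents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if (field === "allow" || field === "disallow") {
      lastWasAgent = false;
      // An empty value (for example "Disallow:") restricts nothing.
      if (value !== "" && groupAgents.includes(userAgent.toLowerCase())) {
        rules.push({ allow: field === "allow", prefix: value });
      }
    }
    // Other fields such as Sitemap are ignored here.
  }

  // The longest matching prefix wins; on a tie, Allow wins.
  let best = null;
  for (const rule of rules) {
    if (!path.startsWith(rule.prefix)) continue;
    const longer = best === null || rule.prefix.length > best.prefix.length;
    const tieAllow =
      best !== null && rule.prefix.length === best.prefix.length && rule.allow;
    if (longer || tieAllow) best = rule;
  }
  return best !== null && !best.allow;
}

const stagingRules = "User-agent: *\nDisallow: /";
const productionRules =
  "User-agent: *\nAllow: /\nSitemap: https://example.com/sitemap.xml";

console.log(isPathBlocked(stagingRules, "/blog/post"));    // true
console.log(isPathBlocked(productionRules, "/blog/post")); // false
```

A check like this can run in a deploy pipeline against the robots.txt that is about to ship, so a leftover staging rule fails the build instead of reaching production.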
Accidental noindex tags
A noindex meta tag is an explicit instruction to Google: do not include this page in search results. It’s the right tool for staging environments and genuinely private pages. Left in production, it silently removes pages from the index.
<meta name="robots" content="noindex">
Search your codebase for every instance of that tag. Remove it from any page you want Google to index, then use Search Console to request reindexing. If your CMS or framework applies SEO settings on a per-page basis, check what the default value is for new pages - a misconfigured default can quietly noindex every new page you publish.
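Searching the codebase finds the tag in templates, but not what a deployed page actually serves. The sketch below scans an HTML string for a robots meta tag containing noindex. It also checks an X-Robots-Tag response header, which is another place a noindex directive can come from. The function name findNoindex, the regex-based parsing, and the lowercase header keys are assumptions made for this example; a real HTML parser would be more robust.

```javascript
// Returns a list of reasons the page would be treated as noindex.
// Assumption: `headers` is a plain object with lowercase header names.
function findNoindex(html, headers = {}) {
  const reasons = [];

  // Look at every <meta ...> tag and read its name and content attributes.
  const metaTags = html.match(/<meta\b[^>]*>/gi) || [];
  for (const tag of metaTags) {
    const name = (tag.match(/\bname\s*=\s*["']?([^"'\s>]+)/i) || [])[1];
    const content = (tag.match(/\bcontent\s*=\s*["']([^"']*)["']/i) || [])[1];
    if (!name || !content) continue;
    if (/^(robots|googlebot)$/i.test(name) && /\bnoindex\b/i.test(content)) {
      reasons.push(`meta ${name.toLowerCase()}: ${content}`);
    }
  }

  // The same directive can also be sent as an HTTP response header.
  const headerValue = headers["x-robots-tag"];
  if (headerValue && /\bnoindex\b/i.test(headerValue)) {
    reasons.push(`X-Robots-Tag: ${headerValue}`);
  }
  return reasons;
}

console.log(findNoindex('<head><meta name="robots" content="noindex"></head>'));
console.log(findNoindex('<head><meta name="robots" content="index, follow"></head>'));
```

An empty array means neither source carries a noindex directive for that response.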
No XML sitemap
Google discovers pages by following links, but a sitemap is a direct signal about what exists on your site. For new pages and pages that aren’t linked from many other places, a sitemap significantly accelerates discovery. Without one, Googlebot may simply never find those pages within a useful timeframe.
For Next.js projects, the next-sitemap package automates this. Install it with npm install next-sitemap, add a next-sitemap.config.js file, run it as part of your post-build process, and submit the output file to Google Search Console under the Sitemaps section.
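For readers who want to see what the generated file contains, here is a minimal sketch that builds a sitemap by hand, following the sitemaps.org format. It is independent of next-sitemap and does not reproduce that package's behavior; the function names and the example.com URLs are placeholders.

```javascript
// Escape the five characters that are special in XML text.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Build a sitemap from a list of { loc, lastmod } entries (lastmod optional).
function buildSitemapXml(pages) {
  const entries = pages.map(({ loc, lastmod }) => {
    const lastmodTag = lastmod
      ? `\n    <lastmod>${escapeXml(lastmod)}</lastmod>`
      : "";
    return `  <url>\n    <loc>${escapeXml(loc)}</loc>${lastmodTag}\n  </url>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}

console.log(
  buildSitemapXml([
    { loc: "https://example.com/", lastmod: "2024-01-15" },
    { loc: "https://example.com/blog/first-post" },
  ])
);
```

Whatever tool produces the file, every URL in it should be one you actually want indexed, in its canonical form.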
Weak internal linking
Even a page that appears in your sitemap can go undiscovered if nothing on your site actually links to it. Googlebot follows links. A page with no internal links pointing to it is effectively isolated - it may sit in your sitemap indefinitely without being crawled.
This issue frequently affects blog posts, landing pages, and documentation pages that get created and then left without connections to the rest of the site. The fix is to add links from pages that are already indexed and receive meaningful traffic: navigation menus, category pages, related article sections, and homepage feature blocks are all effective entry points. Internal linking improves both discoverability and the authority passed between pages.
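Isolated pages can be found mechanically if you have two inputs: the URLs in your sitemap and a map of which pages link to which. The sketch below assumes both are already available in memory. How you collect the link map (a crawl, a build-time scan) is outside this example, and the name findOrphanPages and the sample paths are hypothetical.

```javascript
// Returns sitemap URLs that no other page links to.
// Assumption: linksByPage maps a page URL to the internal URLs it links to.
function findOrphanPages(sitemapUrls, linksByPage) {
  const linked = new Set();
  for (const [source, targets] of Object.entries(linksByPage)) {
    for (const target of targets) {
      if (target !== source) linked.add(target); // a self-link does not count
    }
  }
  return sitemapUrls.filter((url) => !linked.has(url));
}

const sitemapUrls = ["/", "/blog", "/blog/post-a", "/landing/spring-sale"];
const linksByPage = {
  "/": ["/blog"],
  "/blog": ["/", "/blog/post-a"],
  "/blog/post-a": ["/blog"],
};

console.log(findOrphanPages(sitemapUrls, linksByPage));
// [ '/landing/spring-sale' ]
```

Any URL this returns is a candidate for a link from a navigation menu, category page, or related-articles block.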
Duplicate content across URL variants
Google avoids indexing multiple versions of the same content. The problem is that what looks like one page to you can look like four separate pages to Googlebot:
/page
/page/
/page?utm_source=google
/page?ref=campaign
Each of those URLs can be treated as a distinct page competing with the others. The fix is a canonical tag that declares which version is authoritative:
<link rel="canonical" href="https://example.com/page" />
Most frameworks and CMS platforms have built-in canonical support. The task is making sure it’s configured correctly and consistently across the site, not just on the pages you remember to check.
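As an illustration of what consistent canonical handling means, the following sketch collapses the four URL variants above into one canonical URL. The rules it applies (drop utm_* and ref parameters, drop a trailing slash, drop the fragment) are assumptions chosen for this example; your site may canonicalize differently, for instance by keeping the trailing slash.

```javascript
// Parameters treated as tracking-only, in addition to anything starting with "utm_".
const EXTRA_TRACKING_PARAMS = new Set(["ref"]);

// Returns the canonical form of a URL under the assumptions described above.
function toCanonicalUrl(rawUrl) {
  const url = new URL(rawUrl);

  // Copy the keys first so deleting while iterating is safe.
  for (const key of [...url.searchParams.keys()]) {
    if (key.startsWith("utm_") || EXTRA_TRACKING_PARAMS.has(key)) {
      url.searchParams.delete(key);
    }
  }

  // Treat /page/ and /page as the same page, but leave the root path alone.
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1);
  }

  url.hash = "";
  return url.toString();
}

const variants = [
  "https://example.com/page",
  "https://example.com/page/",
  "https://example.com/page?utm_source=google",
  "https://example.com/page?ref=campaign",
];
console.log(variants.map(toCanonicalUrl));
// Each variant maps to "https://example.com/page"
```

The returned value is what would go in the href of the canonical link tag for every variant.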
Thin or low-value content
Google actively filters pages that provide little value from its index. Empty category pages, auto-generated content, placeholder articles, and very short pages with no real substance are all at risk. This isn’t about word count - it’s about whether a page actually does something useful for the person reading it.
Content worth indexing solves a specific problem, answers a question clearly, or offers something a reader can’t get from a hundred other pages on the same topic. Content quality remains one of the stronger signals Google uses when deciding what earns a place in the index.
JavaScript rendering gaps
Modern frontend frameworks - React, Vue, Angular, Svelte - commonly load content entirely through JavaScript after the initial page load. If the content a user sees isn’t present in the initial HTML response, Googlebot may never see it either.
The pattern that causes the problem looks like this:
useEffect(() => {
  fetchData();
}, []);
Content loaded this way is invisible until JavaScript executes. Googlebot does render JavaScript, but rendering is queued separately from the initial crawl, so it can be delayed and should not be relied on for critical content. The fix is to move that content into the initial HTML response using either Server-Side Rendering, where content is rendered per request, or Static Site Generation, where content is rendered at build time. Next.js, Nuxt, Astro, and SvelteKit all support both approaches.
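A quick way to tell whether a page has this problem is to look at the raw HTML response, before any JavaScript runs, and check whether the text you care about is in it. The sketch below does that on an HTML string. The function names are made up for this example, and the regex-based tag stripping is a rough approximation rather than a real HTML parser.

```javascript
// Reduce an HTML document to its text, ignoring script and style contents.
function visibleTextFromHtml(html) {
  return html
    .replace(/<script\b[\s\S]*?<\/script>/gi, " ")
    .replace(/<style\b[\s\S]*?<\/style>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();
}

// True when the expected text is present without executing any JavaScript.
function isInInitialHtml(html, expectedText) {
  return visibleTextFromHtml(html)
    .toLowerCase()
    .includes(expectedText.toLowerCase());
}

// A client-rendered shell: the content only appears after /app.js runs.
const clientRenderedShell =
  '<html><body><div id="root"></div><script src="/app.js"></script></body></html>';

// A server-rendered page: the content is already in the response.
const serverRenderedPage =
  '<html><body><div id="root"><h1>Pricing plans</h1></div></body></html>';

console.log(isInInitialHtml(clientRenderedShell, "Pricing plans")); // false
console.log(isInInitialHtml(serverRenderedPage, "Pricing plans"));  // true
```

To run it against a live page, fetch the URL without a browser (for example with fetch in Node 18 or later) and pass the response text as the html argument.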
Using Search Console to Confirm What’s Actually Happening
Fixing a suspected issue without verifying it in Search Console is guesswork. The URL Inspection tool shows exactly how Google last crawled a specific page - what it saw in the HTML, whether the page was indexed, and what status was assigned. After making any of the fixes above, use the tool to request reindexing rather than waiting for Googlebot to return on its own schedule.
The Page indexing report (formerly called Coverage) shows index status across your entire site, grouped by the same status messages mentioned earlier. Patterns in that report - a large number of pages marked “Discovered – currently not indexed,” for example - often point to a systemic issue rather than a page-by-page problem, which changes where you look first.
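If you have the per-URL statuses in a list, spotting those patterns is a matter of counting. The sketch below groups rows by status and sorts the largest group first. The row shape ({ url, status }) is a hypothetical format for this example, not the layout of any Search Console export.

```javascript
// Count how many URLs share each index status, largest group first.
// Assumption: each row looks like { url: string, status: string }.
function countByStatus(rows) {
  const counts = new Map();
  for (const { status } of rows) {
    counts.set(status, (counts.get(status) || 0) + 1);
  }
  return [...counts.entries()]
    .map(([status, count]) => ({ status, count }))
    .sort((a, b) => b.count - a.count);
}

const sampleRows = [
  { url: "/blog/a", status: "Discovered – currently not indexed" },
  { url: "/blog/b", status: "Discovered – currently not indexed" },
  { url: "/blog/c", status: "Discovered – currently not indexed" },
  { url: "/drafts/x", status: "Excluded by ‘noindex’ tag" },
];

console.log(countByStatus(sampleRows));
```

A single status that dominates the list suggests one site-wide cause to fix first, rather than many unrelated page-level problems.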
Reindexing requests through Search Console don’t guarantee an immediate result. For a typical page, the process takes anywhere from a few days to a few weeks depending on the site’s crawl budget and how recently Google last visited.