The SEO Fix is a blog dedicated to helping people with their search engine optimization, linking out to the most relevant content for each query. One day last year, they received an unusual request from one of their readers: how could they fix pages that Google had crawled but was not indexing? They found the issue and fixed it!
“Crawled - currently not indexed” is a status in Google Search Console that means Google has crawled a page but has not added it to its index. It often shows up after an SEO adds a new page or changes a URL on their website. Requesting indexing through the URL Inspection tool can prompt a recrawl, but it does not guarantee that the page will be indexed.
A technical SEO wrote a case study on how he fixed a strange Crawled - currently not indexed issue on his website. While the answer he discovered may not apply to everyone who has this issue, his process for identifying and addressing it is a valuable walkthrough for resolving technical SEO problems.
What happened to his site’s indexing was strange. His remedy, on the other hand, was simple and logical.
On Adam Gent’s (@Adoubleagent) Twitter account, I found an explanation of the situation.
I wrote a quick blog article regarding a technical SEO problem I was having with my little website.
Canonicalization in a Strange Case –> https://t.co/pC2QAYLjq9
TL;DR – Google’s canonicalization may go horribly wrong, affecting SEO traffic.
November 3, 2021 — Adam Gent (@Adoubleagent)
Crawled - Currently Not Indexed
Many anecdotal reports of Crawled - currently not indexed have surfaced on Facebook, Twitter, and even in John Mueller’s Office-hours hangouts.
In a recent Office-hours hangout, someone asked why Google Search Console (GSC) reported pages as Crawled - currently not indexed when clicking through showed that they were actually indexed. According to John Mueller, that is only a gap in reporting.
In another Office-hours hangout, John Mueller said that it is quite normal for a site to have a large number of pages that aren’t indexed.
He noted:
“…if you have a smaller site and a large portion of your pages aren’t being indexed, I’d take a step back and assess the overall quality of the website rather than focusing on technical difficulties for those pages.
Another thing to bear in mind about indexing is that it’s quite normal for us not to index everything on a website.
And as time goes on, when you have 200 pages on your website and we index 180 of them, that proportion grows lower.”
While all of these are plausible explanations for why some users see the Crawled - currently not indexed status, Adam Gent uncovered a different cause.
The issue he found was not with the site itself but with how Google indexed it, and it looked like a problem in Google’s algorithm.
Why It Was Crawled - Currently Not Indexed
When Adam looked at the GSC Index Coverage report, he saw that Google was crawling and indexing his feed URLs as if they were HTML pages.
He ran site: searches using random phrases from those pages and confirmed that the content of the feed pages was indeed indexed.
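The check described above can be sketched in a few lines of Python. This is only an illustration: the domain and phrase are placeholders, and the helper names `site_search_query` and `search_url` are hypothetical, not something from Adam's write-up.

```python
from urllib.parse import urlencode


def site_search_query(domain: str, phrase: str) -> str:
    """Build a site: query that looks for an exact phrase on one domain."""
    # Wrapping the phrase in double quotes asks for an exact match.
    return f'site:{domain} "{phrase}"'


def search_url(query: str) -> str:
    """Return a Google search URL for the query, to paste into a browser."""
    return "https://www.google.com/search?" + urlencode({"q": query})


# Placeholder values: use your own domain and a sentence copied from the page.
query = site_search_query("example.com", "a distinctive sentence from the post")
print(query)
print(search_url(query))
```

If the result that comes back is a feed URL rather than the page the phrase was copied from, the feed may be the version Google has indexed.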
To make things worse, Google seems to have chosen the RSS feed content as canonical over the actual web pages, which explains why the real pages were crawled but not indexed.
The RSS feeds were generated by WordPress.
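To see how many pages might be affected, one option is to scan a list of URLs (for example, copied from a Search Console export) for feed URLs and pair each with the page it would shadow. The sketch below assumes the default WordPress pattern in which a feed lives at the page URL plus `/feed/`; sites with custom permalinks may differ, and the function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def is_feed_url(url: str) -> bool:
    """True if the URL path ends in /feed (assumed WordPress default)."""
    return urlsplit(url).path.rstrip("/").endswith("/feed")


def page_for_feed(url: str) -> str:
    """Strip the trailing /feed segment to get the page the feed belongs to."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/")
    page_path = path[: -len("/feed")] + "/"
    return urlunsplit((parts.scheme, parts.netloc, page_path, "", ""))


def feeds_shadowing_pages(urls):
    """Map each feed URL in the list to the page URL it duplicates."""
    return {url: page_for_feed(url) for url in urls if is_feed_url(url)}


# Placeholder URLs standing in for an exported list.
indexed = [
    "https://example.com/my-post/",
    "https://example.com/my-post/feed/",
    "https://example.com/feed/",
]
for feed, page in feeds_shadowing_pages(indexed).items():
    print(feed, "->", page)
```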
One peculiar feature of this case is that the feed page appears to be a web page rather than an XML file.
Screenshot of RSS Feed Cache
I might be mistaken, but it does not look like a standard RSS feed. It looks like an HTML page.
Most feeds don’t look like that, even though the underlying code here is XML.
Could this have influenced Google’s decision to canonicalize the feed?
It’s difficult to see how this could happen, given that several signals, such as internal linking, would normally lead Google to treat the HTML pages as canonical.
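One way to check what a feed URL actually serves is to look at its Content-Type header and the first few hundred characters of the response. The sketch below is a rough heuristic, not a diagnosis of Adam's site: the article does not say why his feed looked like HTML, and the `xml-stylesheet` check is only one possible explanation (a stylesheet instruction makes browsers render XML as a styled page). The function name and labels are hypothetical.

```python
def describe_feed_response(content_type: str, body: str) -> str:
    """Classify a feed response as 'html', 'styled-xml', 'xml' or 'unknown'."""
    # Drop parameters such as "; charset=UTF-8" from the header value.
    media_type = content_type.split(";")[0].strip().lower()
    head = body.lstrip()[:500].lower()

    if media_type == "text/html" or head.startswith("<!doctype html") or "<html" in head:
        return "html"
    if "<?xml-stylesheet" in head:
        # XML with a stylesheet attached, so browsers show a styled page.
        return "styled-xml"
    if (
        media_type.endswith("xml")
        or head.startswith("<?xml")
        or "<rss" in head
        or "<feed" in head
    ):
        return "xml"
    return "unknown"


# Placeholder response values for illustration.
print(describe_feed_response(
    "application/rss+xml; charset=UTF-8",
    '<?xml version="1.0"?><rss version="2.0"></rss>',
))
```

A result of `html` or `styled-xml` for a feed URL would be worth a closer look, although it does not prove that Google will index the feed.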
How Adam Resolved the Issue
After Adam found out what had happened, he removed the WordPress-generated feed pages so that the feed URLs returned a 404, and had those URLs crawled again.
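Before waiting on a recrawl, it can help to confirm that the feed URLs really do return a 404. The following is a minimal sketch using only the Python standard library; the URLs and status values are placeholders, and treating 410 the same as 404 is an addition here, since the article mentions only 404.

```python
import urllib.error
import urllib.request

# Status codes that tell a crawler the URL no longer exists.
GONE_STATUSES = {404, 410}


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a HEAD request to the URL."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx and 5xx responses; the code is on the exception.
        return error.code


def still_live(statuses: dict[str, int]) -> list[str]:
    """List the URLs whose status is not 404 or 410."""
    return sorted(url for url, status in statuses.items() if status not in GONE_STATUSES)


# Placeholder results; in practice build this with fetch_status(url) per feed URL.
observed = {
    "https://example.com/my-post/feed/": 404,
    "https://example.com/other-post/feed/": 200,
}
print(still_live(observed))
```

Any URL that `still_live` returns is still being served and could remain in the index.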
After those feed pages were removed from the index, he submitted the correct URLs to Google, and the issue was resolved within a few days.
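The article does not say how Adam submitted the correct URLs. One common route is an XML sitemap submitted in Search Console, and the sketch below builds a minimal one with Python's standard library. The URLs are placeholders and `build_sitemap` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls) -> str:
    """Return a minimal sitemap document listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder page URLs: list the real pages, not their feed URLs.
print(build_sitemap([
    "https://example.com/my-post/",
    "https://example.com/other-post/",
]))
```

For a handful of pages, requesting indexing one URL at a time through the URL Inspection tool is another option.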
What was the source of the issue?
According to Adam, the issue seems to be on Google’s end.
I asked around, and one person said that Google had begun indexing feeds a few years ago, but that he thought the issue had since been resolved.
I’m not an XML specialist, but it strikes me as odd that the feed looks like an HTML page rather than the usual unstyled XML.
The feed does not look normal, so whatever makes it appear that way might be the underlying cause.
Regardless, if you’re seeing Crawled - currently not indexed, this is one more thing to check in case it affects you as well.
Citation
Read the original article for a step-by-step guide to resolving the issue:
Canonicalization in a Strange Case
Frequently Asked Questions
How do I fix crawled currently not indexed?
A: There is no single fix. Check whether Google has indexed a duplicate of the page (such as a feed URL) and selected it as canonical, review the overall quality of the site, and then request indexing of the correct URL in Google Search Console. Adding your site to robots.txt does not get pages indexed; robots.txt only controls crawling.
Why are crawled pages currently not indexed?
A: Google has crawled the page but decided not to index it. Possible reasons covered above include a reporting gap in Search Console, overall site quality, and Google selecting a different URL as canonical.
What does it mean if a page is not indexed?
A: If a page is not indexed, it cannot appear in Google’s search results, so users cannot find it through search.