How To Use The Google Indexation Checker
This is a step-by-step guide on how to use the Google Indexation Checker feature in URL Profiler, and how to interpret the results.
To get a more complete overview of how this function works, I also recommend you read the accompanying blog post, which introduces the Google Index Checker feature.
How To Set Up Indexation Checks
The most common use-case for this feature is to profile all the URLs on a single site, as part of a technical site audit.
We could import a Screaming Frog crawl, but for the purpose of this example we’ll just import our sitemap. Then select ‘Google Indexation’ under the ‘Google’ option.
If you’ve not added proxies to URL Profiler before, the first time you select this option you will be shown a warning:
Proxies are required because this feature automatically queries Google in bulk – and Google DO NOT like you doing this! (There is no other way to check URL-level indexation.)
We recommend a provider called BuyProxies.org, and we suggest you use their Dedicated Proxies.
I have written a full, detailed guide which explains exactly how to use proxies with URL Profiler and how to get set up with BuyProxies.org.
More Proxies = Faster Results
If speed is your priority, more proxies will get the job done faster for you. You can in theory use as many proxies as you want, but we wouldn’t recommend using fewer than 10.
These are not hard and fast rules, as they also depend upon the speed and reliability of your proxies. If you start to see results slow down dramatically, you might need to check your proxies are still working ok.
The table below will give you an idea of how to work out what you may need. Again, please see the proxy guide post for more detail on this.
|No. Proxies|Checking Speed|1000 URLs Will Take|Suggested Max*|
|---|---|---|---|
|10|1 URL every 4 seconds|Approx 70 minutes|1,250 URLs|
|20|1 URL every 3 seconds|Approx 50 minutes|2,500 URLs|
|50|1 URL every 2 seconds|Approx 35 minutes|6,500 URLs|
*Suggested maximum per profile
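If you want to estimate a run before you start, the arithmetic behind the table can be sketched in a few lines of Python. This is only a rough sketch: the figures are copied from the table above, the function names are my own (hypothetical, not part of URL Profiler), and the raw calculation comes out slightly lower than the rounded 'Approx' values in the table.

```python
# Figures copied from the table above:
# number of proxies -> (seconds per URL check, suggested max URLs per profile).
PROXY_TIERS = {
    10: (4, 1250),
    20: (3, 2500),
    50: (2, 6500),
}


def estimated_minutes(url_count, proxies):
    """Estimate run time in minutes for one of the proxy counts listed in the table."""
    seconds_per_check, _suggested_max = PROXY_TIERS[proxies]
    return url_count * seconds_per_check / 60


def smallest_tier_for(url_count):
    """Return the smallest listed proxy count whose suggested max covers url_count.

    Returns None when the list is larger than every suggested maximum,
    in which case splitting it across several profiles may be sensible.
    """
    for proxies in sorted(PROXY_TIERS):
        if url_count <= PROXY_TIERS[proxies][1]:
            return proxies
    return None


if __name__ == "__main__":
    for proxies in sorted(PROXY_TIERS):
        minutes = estimated_minutes(1000, proxies)
        print(f"{proxies} proxies: roughly {minutes:.0f} minutes per 1000 URLs")
```

As the text says, these are not hard and fast rules, so treat the output as a ballpark rather than a promise.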
Once you have your URLs in, and some proxies loaded, you are ready to go! Just hit ‘Run Profiler’ and wait for the program to complete.
Interpreting The Results
Again, I strongly suggest you read the accompanying blog post, so that you understand where we’re coming from with our results.
The index checker will return 5 columns of data, as follows:
- Google Indexed: Can we find the URL in the base index? The result is Yes, No or Alternative URL.
- Google Info: Indexed: We only check this if the URL is not in the base index (i.e. it did not get a ‘Yes’ in the first column). The result is Yes, No, Not Checked or Alternative URL.
- Google Index: Based on the checks above, we determine which index the URL is in. The result is Base, Deep or None.
- Google Indexed Alternative URL: If we found an alternative URL indexed instead of the one we searched for, we display this here.
- Google Cache Date: Simply displays the last cache date for each URL. If there is no cache date, the result is listed as ‘Not Cached’. On occasion we are unable to check the cache date, in which case the message ‘Check Failed’ is displayed instead.
This doesn’t really make a lot of sense without examples, so I will give examples below for each of the logical options.
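Before the examples, here is how I would sketch the decision logic from the column list above in Python. It is an illustration of the rules as described in this guide, not URL Profiler's actual code, and the function name is hypothetical. It assumes that anything other than a 'Yes' in one of the first two columns (including 'Alternative URL') leaves the requested URL with an index of 'None'.

```python
def classify_index(google_indexed, info_indexed):
    """Derive the 'Google Index' value from the first two result columns.

    google_indexed: value of the 'Google Indexed' column.
    info_indexed:   value of the 'Google Info: Indexed' column.
    """
    if google_indexed == "Yes":
        # Found in a normal search for the exact URL.
        return "Base"
    if info_indexed == "Yes":
        # Only returned by the info: check.
        return "Deep"
    # 'No' or 'Alternative URL': the requested URL itself is not indexed.
    return "None"


if __name__ == "__main__":
    examples = [
        ("Yes", "Not Checked"),
        ("No", "Yes"),
        ("No", "No"),
        ("No", "Alternative URL"),
    ]
    for indexed, info in examples:
        print(indexed, "/", info, "->", classify_index(indexed, info))
```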
URL in Base Index
This is the most ‘normal’ result. The URL is properly indexed in the Base index, and it returns as the top result when you search Google for the exact URL.
URL in Deep Index
This is probably the most abnormal result, and the one we discovered when testing the index checker in the first place. It is possible for a URL to be returned when queried using the info: operator, yet not be returned at all when you search Google for the exact URL.
As far as we are able to determine, URLs in the ‘Deep’ index are not findable under normal circumstances, and are therefore not in the Base index from which Google serve their results (i.e. these URLs will never get you any traffic).
URL Not Indexed
A very useful result, but not one you want to see a lot of I’d guess. This means we couldn’t find the URL in the Base Index or the Deep Index – it is not indexed at all.
Alternative URL Indexed
So far we have seen the ‘Alternative URL’ column empty. This comes into play when we process an info: command and are given a different URL to the one we specified. We report ‘None’ in the Google Index column, as the specific URL you requested is not actually indexed.
The 4th column ‘Google Indexed Alternative URL’ is where we specify what the different URL is that Google returned.
Typically this happens for canonical URLs:
The final column we return is ‘Cache Date’. It is not on the screenshots above as I wanted to make sure the distinction was clear between indexing and caching, as they are not the same thing.
Cache Date is the last date Google cached your page, or in cases where your page has not actually changed, it is the date that Google last requested your page for crawling.
Typically these results will look like this:
The data is pretty straightforward, and can generally be considered as a good proxy for ‘last crawl date’.
There are also some less straightforward results that this check generates:
‘Not Cached’ simply means we were unable to find the cache link, meaning the URL is not cached by Google at all.
You can see this in the SERPs by the absence of the green dropdown link.
Of all the results we have tested, ‘Not Cached’ seems to have given us the most false positives (still single digit % though) – so if you are concerned, it might be worth re-running your ‘Not Cached’ results.
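If you export your results to CSV, pulling out the 'Not Cached' rows for a second run can be sketched like this. The 'Google Cache Date' column name comes from the column list earlier in this guide; the 'URL' column name, the function name and the sample rows are assumptions for illustration, so check them against your own export.

```python
import csv
import io


def urls_to_recheck(csv_text, url_column="URL", cache_column="Google Cache Date"):
    """Return the URLs whose cache check came back as 'Not Cached'.

    url_column is an assumed header name; adjust it to match your export.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row[url_column]
        for row in reader
        if row[cache_column].strip() == "Not Cached"
    ]


if __name__ == "__main__":
    # Made-up sample rows, only to show the shape of the input.
    sample = (
        "URL,Google Cache Date\n"
        "https://example.com/a,Not Cached\n"
        "https://example.com/b,Check Failed\n"
        "https://example.com/c,Not Cached\n"
    )
    for url in urls_to_recheck(sample):
        print(url)
```

The resulting list can then be pasted back in as a new, much smaller profile.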
No Date Found
This is a bit more unusual, and it represents URLs that are cached, but for some reason Google are not displaying a cache date.
In this example, the page redirects via 301 to a downloadable PDF. The cached content is an HTML version of the document, but Google offer no cache date.
‘Check Failed’ is different to both of the above cases, and represents URLs that serve a 404 when we request the webcache:
When Gareth told me about this, his words were (quite literally):
“It’s Google fucking with us.”
Whether this is the case or not, we are unable to get the results.
The Most Important KPI for SEO
The reasons for checking index status are simple – if your pages are not indexed they cannot generate organic search traffic.
Further, if your pages are only indexed in the ‘Deep’ index, they cannot generate organic search traffic either.
Often, you’ll be looking for the inverse – you will have pages you don’t want indexed and you’ll want to make sure that they’re not.
Either way, thoroughly checking indexation can be one of the most important stages in a technical site audit.
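Both directions of that check can be sketched together in Python. This is a minimal illustration under my own assumptions: `index_status` maps each profiled URL to its 'Google Index' value (Base, Deep or None), and `should_be_indexed` is a hypothetical set of URLs you want in the index. Neither name comes from URL Profiler itself.

```python
def audit_indexation(index_status, should_be_indexed):
    """Compare the 'Google Index' results against what you want indexed.

    Returns two sorted lists:
      missing  - wanted URLs that are not in the Base index (Deep does not
                 count, and a wanted URL that was never profiled lands here too).
      unwanted - URLs you did not want indexed that are in Base or Deep.
    """
    missing = sorted(
        url for url in should_be_indexed
        if index_status.get(url) != "Base"
    )
    unwanted = sorted(
        url for url, status in index_status.items()
        if url not in should_be_indexed and status != "None"
    )
    return missing, unwanted


if __name__ == "__main__":
    # Made-up data, only to show the shape of the inputs.
    status = {
        "https://example.com/": "Base",
        "https://example.com/products": "Deep",
        "https://example.com/basket": "Base",
        "https://example.com/admin": "None",
    }
    wanted = {"https://example.com/", "https://example.com/products"}
    missing, unwanted = audit_indexation(status, wanted)
    print("Should be indexed but are not in Base:", missing)
    print("Indexed but should not be:", unwanted)
```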
URL Profiler’s index checker will allow you to do this more accurately and more thoroughly than any other SEO tool on the market.