Google has corrected a typo in its crawler documentation that misidentified the user agent of one of its crawlers.
In general this is a small error, but for SEOs and publishers who rely on the documentation to build firewall rules or allowlists, it's a significant one.
A website could unintentionally block a legitimate Google crawler if the wrong user agent string is on record.
Google Inspection Tool
The error appears in the Google Inspection Tool section of the documentation.
This crawler is dispatched to a website in response to two kinds of requests:
1. URL Inspection in Search Console
When a user checks in Search Console whether a webpage is indexed, or requests indexing, Google dispatches the Google-InspectionTool crawler.
The URL inspection tool offers the following functionality:
- See the status of a URL in the Google index
- Inspect a live URL
- Request indexing for a URL
- View a rendered version of the page
- Troubleshoot a missing page
- Learn your canonical page
2. Rich Results Test
This test determines whether structured data is valid and meets the requirements for rich results, which are enhanced search listings.
When the test runs, the Google-InspectionTool crawler retrieves the webpage and examines its structured data.
Why The User Agent Typo Is Problematic
This can be a problem for websites that sit behind a paywall but whitelist particular robots, such as the Google-InspectionTool user agent.
Incorrect user agent identification can also be troublesome when a CMS must use robots.txt or a robots meta directive to keep Google away from pages it shouldn't be crawling.
Some forum content management systems, for example, block bots from areas of the website like the user registration page, user profiles, and the search function to keep them out of the index.
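To illustrate the kind of rules at stake, a forum's robots.txt might block those areas for all crawlers. This is only a sketch; the paths are hypothetical, not taken from any particular CMS:

```txt
# Hypothetical robots.txt for a forum CMS that keeps bots out of
# registration, profile, and search pages.
User-agent: *
Disallow: /register/
Disallow: /profile/
Disallow: /search/
```

Rules like these match crawlers by their user agent token, which is why an incorrect token in the documentation can quietly break them.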
Hard To Spot User Agent Typo
The issue was a hard-to-catch typo in the user agent string itself.
Can you spot the difference? Here is the incorrect string, followed by the corrected one:
Mozilla/5.0 (compatible; Google-Inspect
Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)
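One practical defense is to match on the crawler's product token rather than the full documented string. Here is a minimal sketch in Python; the allowlist and function are hypothetical, not part of any firewall product:

```python
# Hypothetical allowlist keyed on crawler product tokens rather than
# full user agent strings, so minor version changes still match.
ALLOWED_TOKENS = ("Google-InspectionTool",)

def is_allowed(user_agent: str) -> bool:
    """Return True if any allowlisted token appears in the user agent."""
    return any(token in user_agent for token in ALLOWED_TOKENS)

# The corrected user agent string from the documentation matches:
print(is_allowed("Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)"))
# An allowlist built from the earlier truncated string would not have
# contained the full token and could have silently failed to match.
```

Matching on the token also survives future changes to the version number or the rest of the string.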
If you or a client whitelist Google's crawlers or block crawlers from particular webpages, be sure to update any relevant robots.txt rules, meta robots directives, or CMS code.
Compare the corrected documentation with the original version (archived on the Internet Archive Wayback Machine).
It's a minor detail, but one with a significant impact.