Google Webmasters Help FAQ

Weblog for the Google Webmasters Forum

W3C Validation and Crawlability

Posted by John Honeck on March 9th, 2007

It would be foolish to claim that good coding practices should not be followed in any web design effort. However, as the first post in this category, I want to temper any future claims with this observation from soon-to-be-famous Googler Adam Lasnik, who said this on 12/21/06:

Our Googlebot is amazingly persistent and resourceful and is given antacids each day before he crawls.
Seriously… I don’t want to discourage anyone from validating their site; however, unless it’s REALLY broken, we’re likely going to be able to spider it pretty decently.
It’s more important — from a Google-friendly site perspective — that your site adheres to our guidelines and is broadly accessible (serverwise, browserwise, platformwise, etc.)
Being more specific: I’m betting that in the vast majority of cases in which folks have indexing or ranking concerns, the core issue is NOT that their site doesn’t perfectly validate.

I think the most important thing to note is that Adam clearly states that Googlebot is able to crawl broken code. That makes intuitive sense, as I would venture to guess that the majority of all pages on the web are broken in some way.
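To illustrate the general idea that a forgiving parser can still find links in markup that would never validate, here is a minimal sketch using Python's standard html.parser module. It says nothing about how Googlebot actually works; the LinkCollector class, the extract_links helper, and the BROKEN_HTML sample are all hypothetical names made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects href values from <a> tags, ignoring whether tags are closed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Called for every opening tag the parser recognizes,
        # even when the surrounding markup is malformed.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical sample page: unclosed <p>, mis-nested <b>/<i>,
# an unclosed <a>, an unquoted attribute, and no closing tags at the end.
BROKEN_HTML = """
<html><body>
<p>Unclosed paragraph
<div><b>Mis-nested <i>tags</b></i>
<a href="/about.html">About
<a href=contact.html>Contact</a>
<table><tr><td>No closing tags at all
"""


def extract_links(markup):
    """Return every href found in the markup, in document order."""
    collector = LinkCollector()
    collector.feed(markup)
    collector.close()
    return collector.links


if __name__ == "__main__":
    print(extract_links(BROKEN_HTML))
```

Run as a script, this prints both link targets even though the page would fail validation in several ways. A real crawler is far more sophisticated, but the sketch shows why imperfect markup alone need not hide a page's links.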

Going forward, I would like to explore known, proven examples of broken elements that will indeed STOP Googlebot from crawling, caching, and indexing a page.
