Google indexing GET forms?

Posted August 3 by Dan Cryer

I was looking at my server logs this evening, and noticed the following:

66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:21:23 +0000] "GET /?hotel_min_price=80&hotel_max_price=40&hotel_order=name_asc HTTP/1.1" 200 4998 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:22:53 +0000] "GET /?hotel_min_price=60&hotel_max_price=200&hotel_order=distance HTTP/1.1" 200 58235 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:24:23 +0000] "GET /?hotel_min_price=60&hotel_max_price=150&hotel_order=price_asc HTTP/1.1" 200 56684 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:27:22 +0000] "GET /?hotel_min_price=100&hotel_max_price=100&hotel_order=name_desc HTTP/1.1" 200 8075 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:31:52 +0000] "GET /?hotel_min_price=20&hotel_max_price=200&hotel_order=price_desc HTTP/1.1" 200 73847 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:36:21 +0000] "GET /?hotel_min_price=80&hotel_max_price=60&hotel_order=name_desc HTTP/1.1" 200 4998 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:37:51 +0000] "GET /?hotel_min_price=100&hotel_max_price=100&hotel_order=distance HTTP/1.1" 200 8075 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:39:21 +0000] "GET /?hotel_min_price=40&hotel_max_price=200&hotel_order=name_desc HTTP/1.1" 200 73847 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:40:50 +0000] "GET /?hotel_min_price=80&hotel_max_price=200&hotel_order=price_desc HTTP/1.1" 200 28956 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:42:20 +0000] "GET /?hotel_min_price=60&hotel_max_price=100&hotel_order=name_desc HTTP/1.1" 200 50589 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:46:50 +0000] "GET /?hotel_min_price=80&hotel_max_price=100&hotel_order=distance HTTP/1.1" 200 21310 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.35 www.cheaphotels-bristol.co.uk – [03/Aug/2009:23:49:49 +0000] "GET /?hotel_min_price=40&hotel_max_price=300&hotel_order=name_asc HTTP/1.1" 200 75359 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

As you can see, Google seem to be crawling all possible combinations of the (select box only, GET) Ajax form on the site, changing one parameter at a time. Is this something they’ve been doing for a while? Are Google starting to index what’s behind GET forms, or all forms? Most importantly, I guess, will these pages show up in the rankings and/or affect the site’s rankings as a whole?
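
To see at a glance which form values Googlebot actually tried, the query strings can be pulled out of the log and grouped by field. What follows is only a rough sketch: the function names are made up for illustration, the sample entries are three of the requests above trimmed to the request portion, and it assumes the raw log uses plain ASCII double quotes around the request.

```python
import re
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit

# Captures the request target from the quoted "GET <target> HTTP/x.y" part.
REQUEST_RE = re.compile(r'"GET (\S+) HTTP/[\d.]+"')


def form_params(log_line):
    """Return the GET parameters of one log line as a dict, or None if no request is found."""
    match = REQUEST_RE.search(log_line)
    if match is None:
        return None
    query = urlsplit(match.group(1)).query
    # parse_qs returns a list per field; the form submits one value per field.
    return {name: values[0] for name, values in parse_qs(query).items()}


def values_per_field(log_lines):
    """Collect the distinct values seen for each form field across all lines."""
    seen = defaultdict(set)
    for line in log_lines:
        params = form_params(line)
        if params:
            for name, value in params.items():
                seen[name].add(value)
    return dict(seen)


# Three of the requests from the log above, trimmed to the request portion.
SAMPLE = [
    '"GET /?hotel_min_price=80&hotel_max_price=40&hotel_order=name_asc HTTP/1.1" 200 4998',
    '"GET /?hotel_min_price=60&hotel_max_price=200&hotel_order=distance HTTP/1.1" 200 58235',
    '"GET /?hotel_min_price=60&hotel_max_price=150&hotel_order=price_asc HTTP/1.1" 200 56684',
]

if __name__ == "__main__":
    for field, values in sorted(values_per_field(SAMPLE).items()):
        print(field, sorted(values))
```

Run over the full log rather than the three sample lines, this would list every value Googlebot submitted for each select box.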

Update: I’ve just found a post about this on the Google Webmaster Central blog. It says: “Specifically, when we encounter a <FORM> element on a high-quality site, we might choose to do a small number of queries using the form.” and “Only a small number of particularly useful sites receive this treatment, and our crawl agent, the ever-friendly Googlebot, always adheres to robots.txt, nofollow, and noindex directives.” Quite the compliment from Google that they consider my sites ‘high quality’ and ‘particularly useful’, I’d say!
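
Since the quoted post says Googlebot honours robots.txt, a rule there would be one way to keep these form URLs out of the crawl if that ever became desirable. The sketch below is purely illustrative: the robots.txt content is hypothetical (not this site's real file), the helper name is made up, and Python's standard-library parser only does simple prefix matching, so it approximates rather than reproduces how Googlebot reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt, not the site's real file: it blocks only root URLs
# whose query string starts with one of the form's "hotel_" fields.
ROBOTS_TXT = """\
User-agent: *
Disallow: /?hotel_
"""


def is_crawlable(url, user_agent="Googlebot"):
    """Check a URL against the hypothetical rules above using the stdlib parser."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    base = "http://www.cheaphotels-bristol.co.uk/"
    form_url = base + "?hotel_min_price=80&hotel_max_price=40&hotel_order=name_asc"
    print("home page allowed:", is_crawlable(base))
    print("form result allowed:", is_crawlable(form_url))
```

The rule is deliberately narrow so that the home page itself stays crawlable while the form result pages do not.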

It does seem from the above that I’m way behind in catching on to this, as the post linked above is from April 2008. Still an interesting discovery for me, though.
