"...Google will determine..."? Not on my site!
September 24, 2008 4:49 PM
In the StomperNet forums today I responded to a member who noticed a Google post about dynamic URLs. Reproduced here is my acidic response.
That was the most useless, vague, non-actionable and *irresponsible* post I have EVER seen from Google. It looks like something from WebmasterWorld or the Warrior Forum. The examples used are just plain stupid and the sweeping generalization they make about Google somehow figuring out URL parameters is dangerously silly.
- No one would consider rewriting a (so-called) dynamic URL into a "static" one while retaining the session id. I mean DUH! If you are smart enough to even be able to enable mod_rewrite, how could you not know to turn off session ids when serving content to bots? Ridiculous example that serves to paint all rewriting as somehow dangerous. Worse still, why would anyone rewrite like the example shown? That's plain stupid.
- " ... Google will determine which parameters can be removed ..." -- You have got to be Sh*t**g me! Is there anyone who can spell S-E-O that would like to just simply trust Google to "determine" what URLs should be the same and which should be different?? Not me, thanks. My site. I'll decide. If they get it wrong, you get flagged with widespread duplicate content and they don't tell you about it.
- They leave completely unanswered the OBVIOUS (just look at SERPs) problems they have today with session ids -- not so good at "determining" after all, eh? At every single StomperNet Live event we've held, I have reviewed at least one site that had pages indexed at Google showing multiple different session id values. This is a widespread problem for sites that serve session ids to bots and for Google to publicly post about "dynamic" URLs and sweep this under the rug while vaguely claiming to handle it borders on misrepresentation.
- They also don't say a damn thing about parameter order -- another place they fail COMPLETELY to "determine". Example: p1=v1&p2=v2 leads to the same content as p2=v2&p1=v1 and this is a REQUIREMENT of the HTTP spec (named parameters are NOT positional so may appear in any order) but Google treats these as different URLs and will ignorantly and incorrectly index both URLs as different pages. This problem appears in several CMSs today; Endeca in particular has it bad.
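The session id and parameter order complaints above both come down to emitting one canonical URL yourself instead of leaving it to the crawler. Here is a minimal sketch in Python, assuming the application treats named query parameters as order-independent. The function name `canonicalize_url` and the names in `SESSION_PARAMS` are hypothetical and would need adjusting to the platform in question.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session parameter names (an assumption, not from the post);
# replace with whatever your own platform actually emits.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def canonicalize_url(url: str) -> str:
    """Return a URL with session parameters dropped and the rest sorted by name."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Drop anything that looks like a session id.
    kept = [(k, v) for k, v in pairs if k.lower() not in SESSION_PARAMS]
    # Stable sort by name only, so repeated parameters keep their relative order.
    kept.sort(key=lambda kv: kv[0])
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


if __name__ == "__main__":
    a = canonicalize_url("http://example.com/page?p1=v1&p2=v2")
    b = canonicalize_url("http://example.com/page?p2=v2&sid=abc123&p1=v1")
    print(a)
    print(a == b)
```

One way to use something like this is to compare the requested URL against its canonical form and issue a 301 redirect when they differ, so only one variant is ever linked or indexed.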
Re: "...Google will determine..."? Not on my site!
If someone at Endeca would like to engage in a technical discussion with me, there is actually a relatively simple fix that provides SEO friendly URLs while also preserving the very high end site search functions that Endeca is justifiably renowned for.
Re: "...Google will determine..."? Not on my site!
Being from Endeca and having worked on both crawling technology and e-commerce sites for many years, I can absolutely understand your point of view. You're spot on!
Traditionally, Endeca's out-of-the-box components for URL generation have been very SEO-unfriendly, and sites using those components make up a significant percentage of our live customer base. Recently, we've introduced highly extensible and configurable components for SEO URL generation. In addition to fixing URL canonicalization and parameter order, we've also introduced URL keyword control. The customers who've adopted this technology have noticed marked improvements in their SEO rankings.
Yes, we've made mistakes, but we're actively improving in this area.
-Chip
Re: "...Google will determine..."? Not on my site!
Sorry for the radio silence. Here's an example customer:
2) Click on "Designers > Vera Wang"
URL: http://www.bluefly.com/Vera-Wang-Women/_/N-apt8Z1pqkZ1z140iw/designerslist.fly
3) Click on "Color > Red"
URL: http://www.bluefly.com/Red-Vera-Wang-Women/_/N-1z140nnZ1z140iwZ1pqkZapt8/designerslist.fly
4) Click on "Category > Evening Dresses"
URL: http://www.bluefly.com/Red-Vera-Wang-Evening-Dresses/_/N-1z140nnZ1z140iwZfimZapt8/designerslist.fly
This shows how the upgraded URL generation logic works when navigating through a multi-dimensional space (i.e., Endeca Guided Navigation). In e-commerce, apparel is a tough category for ranking (very competitive), so I'm not sure how well Bluefly is doing.
We're actively looking to make SEO better on both the application and tooling side. The upgraded Bluefly URLs are better, but we know that we can do much, much more (for URLs and other aspects of SEO). The key for us is that we have a really tight search and navigation integration, so we can leverage this to make better URL serialization part of the package.
FWIW - Here are some other customers who've adopted the URL optimization module:
http://www.armstrong.com/commflooringna/products/linoleum/gouge-resistant/_/N-67qZ70s
http://www.keljob.com/recherche/transport-logistique/_/N-67vZ6qc?
http://search.espn.go.com/ramirez/mlb/boston-red-sox/stories/43-4294747985-5
The ESPN example shows how you can extend the framework to add your own URL serialization logic.
Final thought: you may want to try alternate paths to the same 'navigation state' (as we call it). Unless our customers have disabled URL canonicalization, you should get absolutely identical URLs no matter how you navigated to that state.
Best regards,
Chip
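The point about reaching the same 'navigation state' by different click paths can be illustrated with a small sketch. This is not Endeca's actual algorithm: the function name `canonical_nav_token` is hypothetical, the `N-` prefix and `Z` separator are only borrowed from the look of the sample URLs above, and the ordering rule (plain lexicographic sort) is an assumption chosen for simplicity. The sample URLs do not appear to follow that order.

```python
from typing import Iterable


def canonical_nav_token(dimension_ids: Iterable[str]) -> str:
    """Serialize a set of selected refinements independently of click order.

    Assumption: a navigation state is fully described by the set of
    dimension value ids selected, so duplicates and ordering carry no meaning.
    """
    # De-duplicate, then impose one fixed (here: lexicographic) order.
    unique_sorted = sorted(set(dimension_ids))
    return "N-" + "Z".join(unique_sorted)


if __name__ == "__main__":
    # Same three refinements, selected in two different orders.
    path_one = canonical_nav_token(["apt8", "1pqk", "1z140iw"])
    path_two = canonical_nav_token(["1z140iw", "apt8", "1pqk"])
    print(path_one)
    print(path_one == path_two)
```

Whatever ordering rule is used, the property that matters is that it depends only on which refinements are selected, never on the order in which the visitor clicked them.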

A programmer since 1974, a business owner since 1988 and a webmaster since 1999, Leslie is currently focused on providing webmasters with leading edge technology to advance their online businesses. Leslie's teachings and tools are actively promoted by a veritable who's-who of search engine marketing and he has been the "man behind the curtain" defining the SEO strategies for a number of successful web businesses.





