Friday, June 17, 2005

SEOmoz | Google's Patent: Information Retrieval Based on Historical Data

The document above is a must-read for anyone who hopes to rank highly in Google's results pages.
Everything in the patent comes straight from Google: no theories, no rumours.

Most optimization is common sense, such as the fact that search engine spiders can't read pictures. In this document, though, you can actually see the other factors that Google are worrying about. There are many, and not all are under your control. That is the way Google want it to be. At the end of the day, in Google's eyes, the best ranking for your site is wherever the user wants it.

If Google think your site is relevant to the user's search (and they will try to avoid being tricked), it will be somewhere on the first page. If not, it might be somewhere at the back, or not listed at all.

In my experience it took a relatively small site about 6 months to break into the number one spot, although it had no serious competitors and was the 'authoritative' source of information on its subject.

Think about your site. Which category does it fall into?

Fast changing
Dynamic content sites, such as news, active businesses, campaigns etc., fall into this category. Often this is the hardest type of site to optimize effectively, as it usually involves dynamic content, which is hard to control. These are the most common sites that users search for, but they are not always wanted...

Static, Informative

Static pages are exactly what the name says. A static site is usually more information- and fact-based. It does not keep up to date with current affairs, but will remain in (roughly) the same state for a long time. These sites are often the best sources of information when the user is looking for an answer to a particular question.


Let's follow a few users on their internet travels, and see what they are searching for, why, and what category site Google should return for them:

User A

User A wants to find out how rice krispies are made.
The first step for the average user would be typing 'rice krispies' into the search box. This leaves him with the top result - you guessed it! The Kellogg's Rice Krispies home page. This is a static, informative site (as informative as it gets :-) ), which does not change rapidly.

User A refines his search to 'rice krispies made'. This, sure enough, strikes gold.
Howstuffworks is a very informative site, though even informative sites change. At the time of writing, this particular article had been added two days earlier. For this reason Google may favour it, thinking that it is a dynamic, current-events site, and that we are searching for a current event (see User B). Google would be wrong.

An interesting note - if User A had typed 'how are rice krispies made?', Google would have ranked howstuffworks at number one. Perfect. This is odd, because usually Google leave out short, common words like 'how' and 'are', but this time it didn't?! Ask Jeeves have always encouraged people to input questions, although their search algorithm works no differently. It does not interpret those questions, it just takes the keywords out. Perhaps Google are working on the question side of things themselves?
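To make the 'takes the keywords out' idea concrete, here is a rough Python sketch of stripping common words from a question. It is only an illustration: the stop-word list and the function name `extract_keywords` are made up for this example, and nothing here is taken from how Google or Ask Jeeves actually work.

```python
# Hypothetical stop-word list, invented for this example.
# A real search engine would use its own (much longer) list.
STOP_WORDS = {"how", "are", "is", "the", "a", "an", "of", "to", "what", "do"}


def extract_keywords(query):
    """Lowercase the query, strip surrounding punctuation, drop stop words."""
    words = (w.strip("?!.,'\"") for w in query.lower().split())
    return [w for w in words if w and w not in STOP_WORDS]


print(extract_keywords("How are rice krispies made?"))
# prints ['rice', 'krispies', 'made']
```

Under this sketch the question and the plain search 'rice krispies made' reduce to the same three keywords, which is why a different ranking for the two is surprising.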

User B

User B is a keen cyclist. He wants to follow the Tour de France from his home, using the internet. He searches for 'Tour de France'. What should Google give him? A static page would inform him about what the race is, its history, and how it started. This is not what User B expects. A dynamic site, on the other hand, would give him the current news and how far the race has gone. This is what User B is looking for. Google will probably work this out by deciding if 'Tour de France' is a current hot topic. If lots of people are suddenly searching for it (Google Zeitgeist), then Google know it must be news-related. They offer news stories on the first page.
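The 'hot topic' test can be pictured with a small Python sketch. This is purely a guess at the general shape of such a check, not Google's method: the function name `is_hot_topic`, the three-day window, the threshold of 3x, and the search counts are all invented for illustration.

```python
def is_hot_topic(daily_counts, recent_days=3, spike_ratio=3.0):
    """Return True if the average volume over the last `recent_days`
    is at least `spike_ratio` times the average of the earlier days.

    All parameters are assumptions chosen for this example.
    """
    if len(daily_counts) <= recent_days:
        return False  # not enough history to form a baseline
    baseline = daily_counts[:-recent_days]
    recent = daily_counts[-recent_days:]
    baseline_avg = sum(baseline) / len(baseline)
    recent_avg = sum(recent) / len(recent)
    if baseline_avg == 0:
        return recent_avg > 0  # any searches at all count as a spike
    return recent_avg >= spike_ratio * baseline_avg


# Made-up daily search counts, oldest first.
tour_de_france = [100, 110, 95, 105, 400, 650, 900]  # sudden surge
rice_krispies = [100, 110, 95, 105, 98, 102, 107]    # steady interest

print(is_hot_topic(tour_de_france))  # prints True
print(is_hot_topic(rice_krispies))   # prints False
```

A query flagged this way would get news results for User B, while a steady query would keep its static, informative results.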

User C

User C is doing a school project on the race. Unlike User B, he doesn't want the news; he wants the static sites. While the race is on he would find the Wikipedia entry further down the page, but at any other time of the year it would probably be ranked quite highly.


The lesson to learn from all of this is that there is no point in competing with sites that aren't in your category. Google will always see the two as different. You need to be able to identify your site's category, and identify your competitors. Concentrate on having the best content, covering your subject(s) in full. Well-written pages (as in text, not just code) will definitely rank higher than pages with poor spelling and grammar.

Good luck, and keep fighting for that #1 spot, just like we all are... :-)
