Caffeine Indexing: What Is It?

If you are not in the SEO or SEM industry, or if you took a bit of vacation this summer, you may find it hard to keep up with the many innovations that Google, Bing, Yahoo and others are rolling out to their users. Google, for instance, has made significant innovations not only in search but also in social media. With so many changes happening, a couple of things might slip through the cracks or never land on a web developer's radar. One item I was recently asked about is Google's Caffeine Indexing, which has been out for a while but is still not on everyone's radar. So, what is Caffeine Indexing?

 

Google processes lots of data. How much? The exact figure is not known, but there are billions of web pages, and millions of them are updated each and every day. This poses a problem for Google, whose job is to scour the web and deliver the most relevant and fresh content to users, who ask for it through millions upon millions of queries.

 

Batch Approach of Indexing

In the past, Google indexed the web via a batch approach. This means that Google would first crawl the web, billions of documents, before it indexed a single one. The time between crawling a page and that page appearing in the index was significant (possibly even 24 hours). So Google realized there was a latency problem. It was especially apparent during big news events, when many hours could go by before the first documents about an event were indexed. This obviously wasn't the kind of service that Google wanted to give its users, which is why it came up with Caffeine Indexing.

 

Caffeine Indexing

Caffeine Indexing is quite an innovation compared to the batch approach. Instead of waiting for the entire web to be crawled before any document is indexed, Caffeine indexes the web incrementally: each document is indexed as soon as it is crawled. This means that with Caffeine there is little or no latency. As Google crawls, it indexes, making millions upon millions of documents searchable almost immediately. So, in the case of big news events, new documents show up extremely quickly, sometimes within a few seconds of being crawled. Caffeine has also helped Instant Search, so instant results are fresher and more relevant.
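To make the contrast between the two approaches concrete, here is a minimal sketch in Python. It is a toy illustration only and says nothing about how Google's systems are actually built; the page list, the URLs, and every function name are made up for the example.

```python
from collections import defaultdict

# Hypothetical toy corpus: (url, text) pairs standing in for crawled pages.
PAGES = [
    ("example.com/a", "breaking news event"),
    ("example.com/b", "fresh content daily"),
    ("example.com/c", "news about fresh search"),
]


def crawl(pages):
    """Yield pages one at a time, the way a crawler discovers them."""
    for url, text in pages:
        yield url, text


def add_to_index(index, url, text):
    """Record the URL under every word it contains (a tiny inverted index)."""
    for word in text.split():
        index[word].add(url)


def batch_index(pages):
    """Batch approach: finish the whole crawl, then build the index."""
    crawled = list(crawl(pages))  # nothing is searchable while this runs
    index = defaultdict(set)
    for url, text in crawled:
        add_to_index(index, url, text)
    return index


def incremental_index(pages, query="news"):
    """Incremental approach: index each page right after it is crawled.

    After every page, yield the page's URL together with the URLs that
    already match the query at that moment.
    """
    index = defaultdict(set)
    for url, text in crawl(pages):
        add_to_index(index, url, text)
        yield url, sorted(index.get(query, set()))


if __name__ == "__main__":
    for url, hits in incremental_index(PAGES):
        print("crawled", url, "-> searchable for 'news':", hits)
```

In the incremental version the first page is searchable as soon as it has been crawled, while the batch version returns nothing until the whole crawl has finished. Both end up with the same index; the difference is how long a fresh page waits before it can be found.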

 

For SEO and SEM Professionals

One more thing to note: because Google indexes the web faster, the SERPs as a whole can be much fresher. Matt Cutts has even said that the index can be up to 50% fresher, which means that those who create fresh, relevant content regularly can have an advantage: their pages may be indexed not only more quickly, but possibly at higher positions in the SERPs.

 

This is why creating content on a regular basis can be hugely beneficial to a website. If you want to take advantage of Caffeine Indexing, make sure you post to your site or blog regularly, at least once per week. In addition, make sure that any new web pages you create are easy for Google to find. Sitemaps submitted to Google are essential, because what is the point of trying to gain Google's immediate attention if it takes a few days or longer for Google to find the new content? You can watch the video of Matt Cutts discussing Caffeine Indexing at the following link: http://www.youtube.com/watch?v=fInTTR8lLS4&feature=channel_video_title
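For readers who maintain their own site, here is a rough sketch of how a sitemap file could be generated with Python's standard library. The namespace is the one defined by the sitemaps.org protocol; the function name, the example URL, and the date are placeholders made up for illustration, and your CMS or blog platform may already produce a sitemap for you.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol (version 0.9).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Return sitemap XML for an iterable of (url, lastmod) pairs.

    lastmod is expected as a YYYY-MM-DD string. This helper is hypothetical,
    not part of any Google tool.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Placeholder entry: replace with your own new pages and dates.
    pages = [("https://www.example.com/blog/new-post", "2011-08-01")]
    with open("sitemap.xml", "w", encoding="utf-8") as handle:
        handle.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        handle.write(build_sitemap(pages))
```

Regenerating the file whenever you publish a new page, and keeping it submitted to Google, gives the crawler a direct pointer to the fresh content.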

About the author

Roger Janik is the President and Founder of ServerSideDesign.com – The Web Marketers.
He began working as a professional web designer and web marketer in 2001, holds a BA in Communications from UHCL and sits on the marketing committee of the Houston BBB. In addition, Roger is a frequent guest on Houston FOX News and CBS Talk Radio, discussing current trends in website marketing and social media. He founded ServerSideDesign in 2004 and has established his company as a leading provider of Search Engine Marketing Services in Houston, TX, as well as on a global scale.
