Getting Faster, Fresher Search Results from a Google Search? Caffeine is the Culprit

by Tony Gonzalez June 10, 2010 12:11 PM

Google announced this week that its new web indexing system, Caffeine, has been live for several days and should provide web search results that are at least 50 percent fresher than those from its old indexing system. Previously, there was sometimes a fairly long delay between posting a website update and having it appear in Google search results. For self storage companies and other small businesses, this means that website updates and new blog entries should show up in Google search results much more quickly than before.

For self storage companies and their prospective tenants, this could be important news. According to Michael Schulman of StorageClicks.com, who presented at the Massachusetts Self Storage Association’s Northeast 2010 Trade Show, there are 823,000 web searches for “self storage” every month. Chuck Gordon, the CEO and cofounder of online self storage marketplace SpareFoot.com, wrote in last November’s Inside Self-Storage that some self storage companies say they get as many as 50 percent of their new tenants online.

On the other hand, not all web searches for “self storage” are done on Google. In fact, a Google search for “self storage” today returns only about 4,620,000 results, far fewer than the same search returns in Yahoo (31,401,512) or in Bing (190,000,000).

Consumers who rely on Google, however, may appreciate Caffeine’s increased indexing speed, because the pages that self storage companies update most often are the ones that contain current deals, coupons, and discounts.

Google’s old indexing system updated its content one layer at a time, and some layers were refreshed faster than others. Content was not added to Google search results until its layer was refreshed. As Search Engine Land’s Vanessa Fox describes it:

"Previously, Google’s crawling and indexing systems worked as batch processes. Googlebot would crawl a set of pages, then process those pages (extracting content from them, associating data about them, such as anchor text and external links, determining what those pages were about), and finally add them to the index. While this system was continuous, all the documents in the batch had to wait until the batch was processed to be pushed live."

Caffeine, on the other hand, analyzes the web in small portions rather than indexing the entire web at one time. Instead of having to index an entire layer of the web in order to add new content, Caffeine can add new pages directly, so no page has to wait for an entire batch to be indexed. Caffeine can index pages and add them to Google searches so quickly that, according to Google, “if this were a pile of paper it would grow three miles taller every second.”
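
The difference between the two approaches can be illustrated with a short Python sketch. This is only a toy model of the idea described above, not Google's actual code: the function names, the example URLs, and the batch size are all made up for illustration. Each function records which pages would be searchable after each page is crawled.

```python
from typing import List


def batch_index(crawl_order: List[str], batch_size: int) -> List[List[str]]:
    """Toy model of batch indexing (hypothetical, for illustration only).

    Crawled pages wait in a pending batch and become searchable
    only once the whole batch has been collected and processed.
    """
    live: List[str] = []
    pending: List[str] = []
    history: List[List[str]] = []
    for url in crawl_order:
        pending.append(url)
        if len(pending) == batch_size:
            live.extend(pending)  # the whole batch goes live at once
            pending = []
        history.append(list(live))  # snapshot of searchable pages
    return history


def incremental_index(crawl_order: List[str]) -> List[List[str]]:
    """Toy model of incremental indexing: each page is searchable
    as soon as it has been crawled."""
    live: List[str] = []
    history: List[List[str]] = []
    for url in crawl_order:
        live.append(url)
        history.append(list(live))
    return history


# Made-up example pages
pages = ["a.example/deals", "b.example/coupons", "c.example/blog"]
batch = batch_index(pages, batch_size=3)
incremental = incremental_index(pages)

print(batch[0])        # [] (nothing is searchable until the batch completes)
print(incremental[0])  # ['a.example/deals']
```

In this sketch both approaches end up with the same set of searchable pages. The only difference is how long each page waits, which is the delay the article describes.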

According to Google representative Matt Cutts, who spoke at the Search Marketing Expo in Seattle on Tuesday, “Caffeine allows us to process data on the order of 100 petabytes.” He went on,

"What is a petabyte? A petabyte is 1,024 terabytes -- so, more than a million gigabytes. And there’s 100 petabytes of information, that scale of information, going into [Caffeine]. So, it’s a lot more data, it allows a lot of flexibility, but fundamentally the change is that as soon as an object gets crawled, boom -- it can get indexed."

Google also claims that Caffeine can do more than index pages faster than its old indexer: it can also store more details about each page. While this change may not affect searches that are happening today, this week, or this year, it may affect searches in the near future, as the web changes and developers begin to write new code that offers other details about web pages. Caffeine, in theory, is set up to immediately start processing whatever new details become available.

Caffeine uses almost 100 million gigabytes of electronic storage. It adds hundreds of thousands of gigabytes’ worth of information every day.
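
Cutts’s petabyte arithmetic can be checked with a few lines of Python. The sketch below uses his figure of 1,024 terabytes per petabyte and assumes the same binary convention of 1,024 gigabytes per terabyte, which the quote does not state explicitly.

```python
TB_PER_PB = 1024  # from the Cutts quote
GB_PER_TB = 1024  # assumption: same binary convention as above

gb_per_pb = TB_PER_PB * GB_PER_TB
gb_in_100_pb = 100 * gb_per_pb

print(f"{gb_per_pb:,} GB in one petabyte")      # 1,048,576 GB in one petabyte
print(f"{gb_in_100_pb:,} GB in 100 petabytes")  # 104,857,600 GB in 100 petabytes
```

Under that assumption, 100 petabytes works out to roughly 105 million gigabytes, which is in the same range as the storage figure of almost 100 million gigabytes.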

Google has been testing Caffeine since late 2009.

Sources used:

Eaton, Nick. “Google overhauls web indexing with ‘Caffeine.’” The Seattle Post-Intelligencer. Blog. June 9, 2010.

Fox, Vanessa. “Google’s new indexing infrastructure ‘Caffeine’ now live.” Search Engine Land. June 8, 2010.

Gordon, Chuck. “Getting your self-storage website noticed: SEO, SEM and third-party referrals.” Inside Self-Storage. Nov. 20, 2009. 

Jacobsson, Sarah. “Google’s Caffeine gives the search engine a boost.” PC World. June 9, 2010.

Osmeloski, Elisabeth. “SMX Video: Google’s Matt Cutts on Caffeine launch.” Search Engine Land. June 9, 2010.

Schulman, Michael. “Self storage Internet marketing solutions.” Summary of presentation from Northeast 2010 Trade Show of the Massachusetts Self Storage Association.