Search Engine War Blog

How Accurate are the Google Keyword Tool’s Volume Estimates?

Friday, 25 July 2008


There was a fair amount of fuss about Google’s decision to include ‘real’ numerical volume data in the keyword tool update (released 07/07/2008). Many people shouted hoorah at a new age of openness from Google, and twice as many huffed and puffed and dismissed it out of hand as inaccurate. But almost no one published the results of their tests with any actual numbers.

So how do we test it? In theory we just need to find a keyword with a 100% impression share for a search term, and see whether its impressions figure for June matches the figure given in the tool. But it was never going to be that simple...

There is no way within the Google interface (to my knowledge) to establish whether a keyword has 100% impression share unless that keyword has been placed in its own adgroup. Our accounts have one-keyword adgroups only in very rare circumstances because, even with high volume exact matches, there are normally a handful of terms that group nicely together, be they plurals or common variations (such as the ever present “keyword UK”, which always gets volume but rarely justifies its own adgroup from a quality score point of view for a UK targeted .co.uk site).

So which keywords can we reasonably assume have a 100% impression share (IS) and match Google’s total keyword volume as closely as possible? In theory a keyword has to fulfil the following requirements to be eligible for this test:

     
  • It needs to be completely uncapped, its display unrestrained by any budget or ad scheduling.
  • It needs to be active on BOTH Google and the search network, as the keyword tool’s figures include both.
  • It needs to have the same language and location settings as the tool, which in our case means English, United Kingdom. The tool cannot yet be set to include region targeting (and if the woeful inaccuracy of the traffic estimator tool when it comes to UK region targeting is anything to go by, I’m not sure I’d use it if it could!).
  • It needs to be running in 1st position at all times. This is because the search network includes ‘search results’ on sites that have limited space to display ads (e.g. eBay) and show only the top few results.
  • Finally, the keyword needs to be set to Exact Match. In theory this isn’t an absolute necessity: you could achieve a 100% IS on a broad match keyword, and the tool’s results are filterable by all match types. But I think the current ‘Expanded Broad Match’ relevancy lottery that Google seems to be running (perhaps the subject of a future post), as well as variation caused by negative keywords, etc., would only add further inaccuracies to the test.
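The requirements above can be read as a simple all-or-nothing filter. The following is only a rough sketch of that idea: the `KeywordSettings` class and its field names are hypothetical, made up for illustration, and do not correspond to any AdWords export or API.

```python
from dataclasses import dataclass


@dataclass
class KeywordSettings:
    """Hypothetical snapshot of one keyword's campaign settings (illustrative only)."""
    budget_capped: bool        # True if budget or ad scheduling ever limits display
    on_google: bool            # active on Google search
    on_search_network: bool    # active on the search network
    language: str
    location: str
    always_first: bool         # shown in 1st position at all times
    match_type: str            # "exact", "phrase" or "broad"


def eligible_for_test(kw: KeywordSettings,
                      tool_language: str = "English",
                      tool_location: str = "United Kingdom") -> bool:
    """Return True only if every requirement in the list holds."""
    return (
        not kw.budget_capped
        and kw.on_google
        and kw.on_search_network
        and kw.language == tool_language
        and kw.location == tool_location
        and kw.always_first
        and kw.match_type == "exact"
    )


# Two made-up keywords that differ only in match type.
brand_term = KeywordSettings(False, True, True, "English", "United Kingdom", True, "exact")
broad_term = KeywordSettings(False, True, True, "English", "United Kingdom", True, "broad")
print(eligible_for_test(brand_term))
print(eligible_for_test(broad_term))
```

Failing any single requirement rules the keyword out, which is why so few keywords in a typical account qualify.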

The type of keywords that immediately spring to mind when we’re talking about   ‘always top, always displayed’ terms are those related to brand. Many of our   clients need to be shown as the top result when a search on their brand is performed,   whether due to competition against an untrademarkable brand, poor positioning   within the algorithmic listings or to promote a specific campaign. Of course,   brand terms are often low volume, but fortunately a few of our clients have   very generic keywords as their brand names, so I was able to proceed with eight   keywords of varying volume levels that should have received, as far as possible,   a 100% impression share.

I’ve hidden the identity of the keywords, but the figures are real.

                                                                                                                                                                                                                                                                                                                                                                                     

Keyword            Actual        KW Tool 'Approx   Difference   KW Tool 'Approx   Difference
                   Impressions   Search Volume:    from         Average Search    from
                   June 08       June'             estimate     Volume'           estimate

Generic Keyword A  58807         22000             167%         74000             -21%
Generic Keyword B  44397         33100             34%          74000             -40%
Generic Brand A    25088         12100             107%         27100             -7%
Generic Brand B    9704          8100              20%          8100              20%
Generic Brand C    8504          8100              5%           9100              -7%
Niche Brand A      2582          1600              61%          2400              8%
Niche Brand B      2377          1300              83%          1600              49%
Niche Brand C      1104          720               53%          1000              10%

      

Many of these differences are huge, with both “Generic Keyword A” and “Generic Brand A” having more than twice the volume Google estimates. Interestingly, the ‘average search volume’ is closer to the actual June figure for all but one result. I wasn’t going to include the ‘average’ statistic, on the theory that it surely takes 12 months of data into account to allow for full seasonality, but the fact that it seems more accurate means it needs to be discussed (more on this below).
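For anyone wanting to repeat the comparison with their own account data, the ‘difference from estimate’ columns can be reproduced as the actual figure's percentage deviation from the tool's figure. This is a small sketch using the numbers from the table; the function name `pct_difference` is just an illustrative choice.

```python
def pct_difference(actual: int, estimate: int) -> int:
    """Percentage by which actual impressions differ from the tool's estimate."""
    return round((actual - estimate) / estimate * 100)


# (keyword, actual June impressions, tool June estimate, tool average estimate)
rows = [
    ("Generic Keyword A", 58807, 22000, 74000),
    ("Generic Keyword B", 44397, 33100, 74000),
    ("Generic Brand A", 25088, 12100, 27100),
    ("Generic Brand B", 9704, 8100, 8100),
    ("Generic Brand C", 8504, 8100, 9100),
    ("Niche Brand A", 2582, 1600, 2400),
    ("Niche Brand B", 2377, 1300, 1600),
    ("Niche Brand C", 1104, 720, 1000),
]

for name, actual, june, average in rows:
    print(f"{name:18} June {pct_difference(actual, june):+4d}%   "
          f"average {pct_difference(actual, average):+4d}%")
```

A positive figure means the tool underestimated the real volume; a negative one means it overestimated.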

Why the inaccuracy? Well, we must take into account Google’s manipulation of   the figures. Google admits to only giving ‘approximate’ data, but goes one step   further than simply rounding - it actually groups all keywords into sets of   volumes. No matter what search query you put into the tool its volume will always   be one of a fixed set of results.

For example, with the kind of volumes we’re dealing with (between 1,000 and   100,000 impressions) Google allows only the following results from the keyword   tool:
  90500, 74000, 60500, 49500, 40500, 33100, 27100, 22200, 18100, 14800, 12100,   9900, 8100, 6600, 5400, 4400, 3600, 2900, 2400, 1900, 1600, 1300, 1000, 880,   720
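It helps to see how these values are spaced. The sketch below simply divides each value by the next one up; it uses only the list above, and the variable names are illustrative. The steps look roughly proportional rather than evenly spaced, so the absolute gap between neighbouring values grows with volume.

```python
# The fixed values listed above, largest first.
BUCKETS = [90500, 74000, 60500, 49500, 40500, 33100, 27100, 22200, 18100,
           14800, 12100, 9900, 8100, 6600, 5400, 4400, 3600, 2900, 2400,
           1900, 1600, 1300, 1000, 880, 720]

# Ratio of each value to the next highest one, and the absolute gap between them.
ratios = [smaller / larger for larger, smaller in zip(BUCKETS, BUCKETS[1:])]
gaps = [larger - smaller for larger, smaller in zip(BUCKETS, BUCKETS[1:])]

print(f"ratio: min {min(ratios):.3f}, max {max(ratios):.3f}, "
      f"mean {sum(ratios) / len(ratios):.3f}")
print(f"gap at the top: {gaps[0]}, gap at the bottom: {gaps[-1]}")
```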

Let’s put those estimations against our keywords. I have rounded each actual keyword volume DOWN (as all of the results exceed the estimations) to the closest available result. The table now looks like this for June:

                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           
 
Keyword            Rounded Down        KW Tool 'Approx   Difference
                   June 08             Search Volume:    from
                   Impression figure   June'             estimate

Generic Keyword A  49500               22000             125%
Generic Keyword B  40500               33100             22%
Generic Brand A    22200               12100             83%
Generic Brand B    8100                8100              0%
Generic Brand C    8100                8100              0%
Niche Brand A      2400                1600              50%
Niche Brand B      1900                1300              46%
Niche Brand C      1000                720               39%

      

As you can see, two keywords, “Generic Brand B” and “Generic Brand C”, are now correct, but all the others were underestimated by at least one category. This isn’t such a big issue at the lower end of the volume spectrum, where sets are more closely spaced, but for higher volume keywords it makes the data wildly inaccurate. Generic Keyword B was only one set out, but that is still more than 10,000 impressions, potentially hundreds of clicks a month.
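The rounding-down step, and the count of how many sets each estimate was out by, can be sketched as follows. The helper names `round_down` and `sets_apart` are illustrative, and the only inputs are the value list and table figures from this post. Generic Keyword A is left out of the example because the 22000 shown for it is not itself one of the listed values (the nearest is 22200).

```python
from bisect import bisect_right

# The tool's fixed values for this volume range, in ascending order.
BUCKETS = sorted([90500, 74000, 60500, 49500, 40500, 33100, 27100, 22200,
                  18100, 14800, 12100, 9900, 8100, 6600, 5400, 4400, 3600,
                  2900, 2400, 1900, 1600, 1300, 1000, 880, 720])


def round_down(volume: int) -> int:
    """Largest listed value that does not exceed the given volume."""
    i = bisect_right(BUCKETS, volume)
    if i == 0:
        raise ValueError("volume is below the smallest listed value")
    return BUCKETS[i - 1]


def sets_apart(actual: int, estimate: int) -> int:
    """How many sets the rounded-down actual figure sits above the tool's estimate."""
    return BUCKETS.index(round_down(actual)) - BUCKETS.index(estimate)


# (keyword, actual June impressions, tool June estimate)
rows = [
    ("Generic Keyword B", 44397, 33100),
    ("Generic Brand A", 25088, 12100),
    ("Generic Brand B", 9704, 8100),
    ("Generic Brand C", 8504, 8100),
    ("Niche Brand A", 2582, 1600),
    ("Niche Brand B", 2377, 1300),
    ("Niche Brand C", 1104, 720),
]

for name, actual, estimate in rows:
    print(f"{name:18} rounds down to {round_down(actual):6d}, "
          f"{sets_apart(actual, estimate)} set(s) above the estimate")
```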

The impression data that the tool produced for June is not accurate, even once Google’s rounding is taken into account. Does this make it useless? Well, no, of course not. First of all you’ll notice that, with the exception of “Generic Keyword A”, the keywords’ relative positioning in the volume list was correct. Secondly, the average volume was more accurate, and this is the column most people will use for all but seasonally affected campaigns.
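One way to check the relative positioning claim is to look for pairs of keywords whose order by actual impressions disagrees with their order by the tool's June estimate. This is a minimal sketch using the table's figures; `discordant_pairs` is an illustrative name, and ties (such as the two 8100 estimates) are not counted as disagreements.

```python
from itertools import combinations

# (keyword, actual June impressions, tool June estimate)
rows = [
    ("Generic Keyword A", 58807, 22000),
    ("Generic Keyword B", 44397, 33100),
    ("Generic Brand A", 25088, 12100),
    ("Generic Brand B", 9704, 8100),
    ("Generic Brand C", 8504, 8100),
    ("Niche Brand A", 2582, 1600),
    ("Niche Brand B", 2377, 1300),
    ("Niche Brand C", 1104, 720),
]


def discordant_pairs(data):
    """Pairs whose order by actual volume is the reverse of their order by estimate."""
    pairs = []
    for (name1, actual1, est1), (name2, actual2, est2) in combinations(data, 2):
        # A negative product means the two differences point in opposite directions.
        if (actual1 - actual2) * (est1 - est2) < 0:
            pairs.append((name1, name2))
    return pairs


print(discordant_pairs(rows))
```

With these figures the only pair out of order is Generic Keyword A against Generic Keyword B.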

This disparity between the accuracy of the average impressions and the June impressions suggests to me that Google has manipulated the results. The way in which the keywords are grouped into sets for volume might suggest they are grouped in other ways, perhaps at a thematic level, and if June was a poor volume month in certain sectors (which it was) Google might moderate the results in these sectors down. It’s much more dangerous for Google to overestimate volume than to underestimate it: a new advertiser may be put off bidding on a high volume term, or may limit their bid, budget or location targeting because of a high volume estimate. It therefore makes sense for Google to cautiously underestimate impression volume, as evidenced by the fact that all June results were less than the actual impressions.

The final factor is the location/language setting chosen by the user. How accurate is Google’s location data? Does it identify UK volume in the same way it identifies UK users when ads are displayed, or does it use a more simplistic method? The only way to find out would be to look at data from accounts in other regions, particularly the default US, and see what trends are observed. Additionally, I believe it perfectly likely that if modifications are made to the volume figures of keyword groups, as suggested above, then these modifications are probably based on US data and rolled out across other markets (particularly if the language is the same).

I hope this has helped give an insight into the accuracy of the data the tool produces. Please let me know what trends you’ve seen. I’ll update this when the July figures are published and see if my observations carry through. I still think this level of visibility is much better than the old relative volume bars; if nothing else we can use the results to show a more accurate relative volume than the old 1-5 scale. Use it, with caution, and assume more volume than estimated.

Comments

Evan W.

Great post...I have been using this tool and I think it's fairly accurate. I have always thought Google doesn't do enough with their keyword tools!

Jowan

From my experience, the higher the volume the more Google overestimates it; divide them in half and you're about correct. On mid range terms it is as good as, if not better than, any other tool. On really low volume terms Google will say it doesn't have enough data, but Wordtracker will at least have some figures.

For the purpose of estimating all of them should be taken with a pinch of salt.

Seo Services Company Pune

Nice post....great amount of information..although I will have to go through it again.

ronnydidit

I don't know about the search volume data, but I sure know a lot about the Advertiser Competition data, which is inaccurate to the extent of being ridiculous!
Go to my blog and I'll prove it to you.

Erin Borrini

Wonderful post! This is in line with similar research I conducted when Google released the new numbers—I found the same search volume groupings and values as you indicate. I also went a step further to see if there was any “theme” behind the groupings. I found that each value (most consistently in groupings that fall between 2,900 and 823,000) is approximately 82 percent (give or take 1 pct) of the next highest (2900/3600; 3600/4400; etc.). Across all groupings, this relationship between the values ranges from 75 percent to 85 percent. I thought this correlation was interesting, but I’m not sure it goes further than that. Any thoughts on what this might mean or how it might be useful from SEM perspective?

Also, I was curious about your impression statistics—do those come straight from Google Adwords, another tool or your own calculations? I’m not familiar with the advanced features of Adwords, so I might be missing out. Still, impression share seems a valuable way to analyze keyword volumes. Thanks for your insights and great job on the analysis.

Matt Whelan

Thanks Erin - the impression figures are straight from Adwords. As mentioned, it would have been better to use one-keyword adgroups so we could monitor Google's "official" impression share, but they weren't available. IS is a great analysis tool and I wish they provided it at keyword level (but that would give too much away, wouldn't it!)

Your 82% figure is very interesting - I'm not a mathematician, but there is a sequence here and it must relate to something. A colleague sent me Brian Turner's speculations (http://www.internetbusiness.co.uk/08072008/google-reveals-search-volume-figures) that it might be related to Google grouping keywords for algorithmic purposes. Let me know if you find anything!

bdo

I am trying to rank keywords by UK volume only.

But it's not working.

I'm using the keyword tool from within an Adgroup with a UK country specific location chosen - and therefore it should pick up that I want UK stats. It does not - proof is that the US word attorney shows a volume of 6 million vs 1.5 million for solicitor - therefore it is drawing on USA or worldwide stats.

Am I doing something wrong?

Matt Whelan

bdo, even from within UK accounts you'll notice that "Results are tailored to English, United States" by default (above the provide feedback line). You'll need to edit this to English, United Kingdom before you run your search.


Not sure why they can't make it default to the language and location settings of the account, would be very useful!

hi

nice and informative post,
thanx

Sam Sinton

Great post.

Google do not like people using keywords that are not often searched (by large volume) as it causes an unnecessary strain on their systems. So, in an attempt to put would-be advertisers off using them altogether, they underestimate their search volume.

Google wants everyone to bid for popular terms as this pushes up the prices and they make more money.

Robin F

Interesting data! I wouldn't use the keyword tool to calculate bids, but your data gives me confidence that there is sufficient accuracy in the estimations to make it a very useful tool for finding out potential keyword phrases.

UK web hosting

Great post, thanks for the info. I'll need to read it again though, as there is quite a lot to take in. I was very suspect of the keywords figures until I read this.

naila

A very interesting approach to finding out if the Google Adwords tool data is precise. It seems that we should all apply some kind of industry filter of error margins on each term we look up and use the data with a pinch of salt.

Hassam

Great post. Google keyword tool is accurate for all practical purposes and the best thing it is free. So use it in finding how popular a particular keyword is.

Rory

I have been using the keyword tool for a while. I think it's quite inaccurate, as I have established a top ten position for a keyphrase that apparently receives 245,000 search queries per month, yet I only receive 1-2 visits per day from that particular phrase.

Conrad

From my research I'd say that a lot of their estimates are out by 1000% +.

ismail

I have keyword Tool That calculated actual Volume
Amazing Tool

J

It is a crap tool in terms of reporting actual visitor numbers. That is just to trap bidders. Just use it for the purpose of keyword ideas.

Pete

I'm glad it isn't just me! I have come to the same conclusion as others - the keyword tool is next to useless for estimating traffic.
In defense of Google though, their new webmaster tools are good.

Web Designer

I think that this proves we are doing the right thing by using the Google Keyword Tool as a guide and not as the gospel. Using this in conjunction with your adwords ppc analysis and google analytics feedback should all work well as part of a keyword targeted online marketing campaign.

yournetbiz

great post here, although im not sure if the google keyword tool is 100% accurate, i have run a test by using the particular keyword in phrase marks and running a campaign in adwords to see the daily search impressions to see how accurate it was compared to the keyword tool
