Correlation studies of Google search results have a long tradition in SEO. But they may be an unreliable source of insight into how Google ranks web pages, and building an SEO strategy on their findings may result in poor decisions.
Bill Slawski On Pitfalls of Correlation Studies
I asked Bill Slawski (@bill_slawski) about correlation studies and he pointed out several reasons why those studies may lead to inaccurate conclusions.
This is what Bill said:
“One issue is the placement of search results that are augmented results taken from knowledge based upon an entity in a query.
A correlation study may have difficulty explaining anything about those without an awareness of how that augmentation process works, or One Boxes, which aren’t in search results because of the number of backlinks pointed to them.
The same with top stories at the top of search results. They aren’t there based on backlinks.
The data in correlation studies may be cleaned so that one boxes and featured Snippets don’t appear within them, but it’s been a long time since we lived in a world of ten blue links.
Not sure I have seen a correlation study that covers those other types of results well at all.
How would a correlation study include information about top stories which are triggered in the SERPs depending upon the query?
Google has a number of triggering events happening based on the query. Has data been cleansed to remove those, since they may not fit into the conclusions of the study? I suspect that may take place in a lot of studies.
The manipulation of the data is unfortunate because it means the results of the study mirror the conclusions set for it before it is run.”
Some correlation studies find that having more links correlates with the number one ranked sites. Other correlation studies claim to have discovered ideal anchor text ratios and how many links should be pointed to the home page.
But there is a problem with these kinds of findings.
Positions 1 – 10 of Google’s search results are often ranked for different search intents. The classic example of this reality in the SERPs is the search results for the search phrase, jaguar.
One way Google ranks the SERPs is according to the popularity of search intent. In the case of the search phrase, jaguar, the most popular search intent is that of the automobile.
The reason that the web page for jaguar the animal does not rank in the top spot has nothing to do with the number of backlinks or anchor text ratios. It’s excluded from the number one position because the search intent for the animal does not match the search intent for the automobile.
A search for jaguar could be a search for:
- a car
- an animal
- a football team
- news about the car
- reviews about the latest car
- videos about the car
The above are examples of what Google currently ranks in the top ten for the search query, jaguar. That’s six search intents just for the first page of Google’s search results pages (SERPs).
The above is an example of search intent diversity. Search intent diversity is a big reason why correlation studies are unreliable.
Search Intent Diversity
There is a diversity of search intent for just about every search query.
The more vague the search query, the more likely Google will show navigational search features like People Also Ask, which further complicates correlation studies.
Search results made up of ten blue links are a thing of the past. But correlation studies treat the SERPs as if they were still ten blue links. That’s another way that correlation studies are flawed.
Correlation studies ignore the reality of search intent diversity and many other modern day search features.
This is an example of what a typical search results page might look like:
- Position three might be a result about How to Do A.
- Position four might be about where to buy A.
- Position five might be about reviews of A and their competitors B and C.
- Position six might be about latest news about A.
In the above example, no site is ranked because of the number of links it has. Each is ranked according to how popular its search intent is.
It’s like the search results for the search term Jaguar. The top result (an automobile site) isn’t there because it has more backlinks than the Wikipedia page for jaguar the animal.
The top result is there because the most popular search intent for the phrase jaguar is a web page about Jaguar automobiles.
The backlink counts between positions one, three, four, five and six are totally irrelevant to the reason why those pages are ranked in those positions. There are generally multiple search intents for any given search query.
Consequently, any correlation study that draws conclusions from the top ten or top twenty of Google’s search results is going to yield information that does not accurately reflect how Google ranks web pages.
To try to get a more accurate result, a research study would have to first identify the different intents and assign them a classification like Informational, Transactional, Educational, etc.
But even if that were done, there is still a flaw. The search intent classifications will not match the search intents that Google used to create any given SERP.
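To make the classification step concrete, here is a minimal Python sketch of what such a study might do: label each result with an intent, then compute a rank correlation between position and backlink count within each intent instead of across the whole page. Everything in it is an assumption for illustration. The field names, the intent labels and the backlink numbers are invented, the helper functions are hypothetical, and the labels are the researcher's guess, not the intents Google actually used.

```python
from collections import defaultdict


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation, or None when it is undefined."""
    if len(xs) < 3:
        return None
    rx, ry = ranks(xs), ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return None
    return cov / (vx * vy) ** 0.5


def correlation_by_intent(results):
    """Group results by their (hand-assigned) intent label, then correlate."""
    groups = defaultdict(list)
    for row in results:
        groups[row["intent"]].append(row)
    return {
        intent: spearman(
            [r["position"] for r in rows], [r["backlinks"] for r in rows]
        )
        for intent, rows in groups.items()
    }


# Invented data: it exercises the functions and is not a finding about Google.
serp = [
    {"position": 1, "intent": "transactional", "backlinks": 120},
    {"position": 2, "intent": "informational", "backlinks": 900},
    {"position": 3, "intent": "transactional", "backlinks": 80},
    {"position": 4, "intent": "informational", "backlinks": 400},
    {"position": 5, "intent": "transactional", "backlinks": 30},
    {"position": 6, "intent": "informational", "backlinks": 150},
]

pooled = spearman(
    [r["position"] for r in serp], [r["backlinks"] for r in serp]
)
per_intent = correlation_by_intent(serp)
print("pooled:", pooled)
print("per intent:", per_intent)
```

With these made-up numbers the pooled figure and the per-intent figures differ sharply, which shows how much the answer depends on the grouping. Since the grouping is a guess, the per-intent numbers inherit the same flaw described above.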
Nobody Can Reverse Engineer Search Results
For the reasons outlined above, unraveling Google’s search results is not as simple as correlating ranking factors with millions of SERPs.
Correlation studies have always been unreliable. Yet many people continue to believe in them. They make great clickbait.
But perhaps it’s time for the SEO industry to grow up and set them aside.