Understanding Google PageRank
The Google PageRank algorithm is more complex than most SEO commentators suggest, and its details are closely guarded. The formulation has no doubt also changed over time, in pursuit of better results and in response to changing web practices. At its heart, though, PageRank is a simple voting mechanism: every link to another web page is a vote, but not every vote is equal in the eyes of Google. This post aims to demonstrate the importance of Google PageRank to the success of your web page (it is page based, not site based) and to provide some understanding of how PageRank works in a practical sense.
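To make the voting idea concrete, here is a minimal sketch based on the originally published PageRank formulation (Brin and Page, 1998). It is not Google's current, guarded algorithm. The function name, the three-page link graph and the iteration count are assumptions for illustration; 0.85 is the damping value suggested in the original paper.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Toy PageRank over a dict of {page: [pages it links to]}."""
    # Collect every page that links out or is linked to.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page starts each round with a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly across its outbound links,
                # so a vote from a strong page is worth more.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outbound links spreads its
                # vote evenly over all pages.
                share = damping * rank[page] / n
                for other in pages:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical graph: "a" and "b" both vote for "c"; "c" votes for "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
for page, value in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(value, 3))
```

In this toy graph, "c" ends up strongest because it receives two votes, and "a" outranks "b" because its single vote comes from the strong page "c", whereas "b" receives no votes at all.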
A Caveat for this Post
This post comes from years of observation rather than any insider knowledge. Although I have read what I can and have spoken on a number of occasions to tight-lipped Google staff, this article is speculative rather than definitive, fitting theory to observation. You will need to take my findings as they are intended – a working practitioner’s externally constructed view of PageRank. This post is also best considered in combination with my earlier entry – Ten Things That Matter For Search.
PageRank Bubble Diagram (explained below)
Using the PageRank Bubble Diagram …
STEP ONE – PAGERANK (Vertical Index)
To use this for your own requirements you will need a way of determining the PageRank of a given web page. There are a number of reporting sites and programs that will give you the value; however, the easiest way is to install the Google Toolbar, visit the page in question, hover over the PageRank icon in the Toolbar and read off the resulting score (from 0 to 10) – I will talk more about the score later. For now, you have a number. As an example, look at www.apple.com – at the time I wrote this piece, it had a PageRank of 9 (very strong).
STEP TWO – CONTEXT STRENGTH (Distribution Balloon)
In this step, you need to consider the context (specific meaning) strength of the web page. This is the relevance of the content on the page to the word or phrase that is being searched for using Google. As we discussed in Ten Things That Matter for Search, different page elements have different contextual impact. In this diagram, we are using an arbitrary scale from 1 to 100 (1 is at the far left of each ‘balloon’ and 100 is at the far right or ‘tail’). The reason each ‘distribution’ has a tadpole-like shape is that the bulk of pages will be in the lower reaches (score wise), and it is only pages fundamentally focused on a given key word or phrase (context) that will be positioned on the right-hand side (closer to 100). Scoring is ‘very subjective’ (a bit like scoring Chess pieces), but as an example, try the following (we will use Steve Jobs as our example – replace Steve Jobs with whatever key phrase(s) you wish to assess).
Domain Name … www.stevejobs.com … 50 points
Page Name within the URL string … http://www.xyz.com/SteveJobs … 20 points
Page/Window Title (first 65 characters) … 8 points
First H1 Heading on the Page … 5 points
Opening Body Text (first 155 characters) … 4 points
Additional mentions in body text (max eight) … 1 point
Links from Incoming Strong Pages (with context) … 1 point (max)
Metadata, Image Names, ALT Tags, etc. (max 8 = 4 pts) … 1/2 point
(this is all just indicative – the actual ‘context’ algorithm is unknown)
In my ‘Apple’ example, there is no direct mention of Steve Jobs, but there is 1 point from incoming related pages. So I have the Apple home page in the PageRank ‘9’ balloon at the extreme left-hand edge (context score of 1).
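The indicative scoring in Step Two can be sketched as a small function. This is only a transcription of the point values listed above, not the real (unknown) context algorithm; the function and parameter names are my own assumptions. Conveniently, the maximum values add up to 100, which matches the scale used in the diagram.

```python
# Indicative one-off points from the Step Two list (not Google's real weights).
WEIGHTS = {
    "domain_name": 50,     # key phrase is the domain name
    "url_page_name": 20,   # key phrase is the page name within the URL
    "title": 8,            # key phrase in the first 65 characters of the title
    "first_h1": 5,         # key phrase in the first H1 heading
    "opening_body": 4,     # key phrase in the first 155 characters of body text
}


def context_score(elements=(), body_mentions=0, strong_inbound_links=False,
                  other_mentions=0):
    """Return an indicative context score for one key phrase on one page."""
    score = float(sum(WEIGHTS[name] for name in elements))
    score += min(body_mentions, 8) * 1.0        # 1 point each, max eight
    score += 1.0 if strong_inbound_links else 0.0  # 1 point maximum
    score += min(other_mentions, 8) * 0.5       # 1/2 point each, max 4 points
    return score


# The Apple home page: no direct mention, only related incoming links.
print(context_score(strong_inbound_links=True))
```

A page that ticked every box would be scored with `context_score(WEIGHTS, body_mentions=8, strong_inbound_links=True, other_mentions=8)`.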
STEP THREE – COMPARISON (Other Pages)
Repeat the above two processes for any comparable pages. In my example, I am going to use the Wikipedia page on Steve Jobs (you can view it here). The result at the time of drafting was: Step 1 – PageRank of 7. Step 2 – Context Score of 42 (from URL 20, Heading 5, Body Text 12, Links 1, Other 4). So I position this in the PageRank ‘7’ balloon about midway between the 1 and 100 (score of 42).
STEP FOUR – WHO WINS?
Based on the two (or more) results you have placed on the Bubble Diagram using Steps 1 to 3 above, the one furthest to the right wins (well, not necessarily; there are at least four other issues).
In my example, Apple (PageRank 9 and Context Score 1) would have beaten Wikipedia (PageRank 7 and Context Score 42), whereas in actual fact Wikipedia wins. My chart is a little over-baked on the power of PageRank to make a point: if Apple spoke even a little about Steve Jobs on the home page (and left it there for a while), it would win in search – easily! PageRank is a very powerful driver of search placement and makes a page perform strongly on any context without the effort required to ‘strengthen context’ on lower PageRank web pages.
Some Other Issues
Score – PageRank is not really a score out of 10. If you thought about PageRank as a linear score resulting from inheritance from inward links, a page like Apple’s home page might rank in the billions. The ‘approximation’ used in the PageRank score that we see in the Google Toolbar (out of 10) is not linear. It might be better to think of it as the number of zeros at the end (PageRank 0 would be 0 to 10, PageRank 1 would be 11 to 100 and PageRank 9 would be 1,000,000,001 to 10,000,000,000). Our ‘simplification’ will only serve you as a guide. In addition, the score provided in the Google toolbar is only an approximate value.
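The ‘number of zeros’ idea can be written down as a short sketch. It simply encodes the illustrative ranges given above (0 to 10, 11 to 100, and so on); the function names are hypothetical, and capping anything above the top range at 10 is my own assumption, since the real scaling is not published.

```python
def toolbar_range(toolbar_pr):
    """Return the (low, high) raw-score range for a Toolbar PageRank of 0 to 10."""
    if not 0 <= toolbar_pr <= 10:
        raise ValueError("Toolbar PageRank runs from 0 to 10")
    low = 0 if toolbar_pr == 0 else 10 ** toolbar_pr + 1
    high = 10 ** (toolbar_pr + 1)
    return low, high


def toolbar_from_raw(raw_score):
    """Map a hypothetical linear (raw) score to its Toolbar PageRank."""
    if raw_score < 0:
        raise ValueError("raw score cannot be negative")
    for toolbar_pr in range(11):
        if raw_score <= toolbar_range(toolbar_pr)[1]:
            return toolbar_pr
    return 10  # assumption: anything larger is capped at 10


print(toolbar_range(1))             # the 11 to 100 band
print(toolbar_range(9))             # the band for a page like Apple's
print(toolbar_from_raw(5_000_000))  # a raw score in the millions
```

On this reading, each extra Toolbar point means roughly ten times as much underlying PageRank, which is why moving from 7 to 9 is far harder than moving from 1 to 3.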
Time – It takes time to inherit PageRank (new web pages do not have any to begin with). The Google bot (automated analysis by Google) only sweeps a site every so often and even less frequently for PageRank reassessment. Content must be on a Web Page for some considerable time for the search benefits of both context and PageRank to accrue. The amount of time depends on the initial PageRank, Google bot sweep frequency and other factors. Time is also a factor itself in search performance – longer-standing content will outperform new content unless your site is categorized for news.
Complexity – We said at the outset that this model is only a guide and that the process is very complex. In addition to the above elements, several other factors have an impact: the type of web page, the parent domain and the depth of the page within the website, the amount of similar content, the top-level domain (.com, .org, .biz, etc.), the location of the website (its allocated country), which has a very significant impact, and your connection to Google. If you run a Google AdWords campaign, your organic search placement tends to improve. Perhaps this is because you are paying attention to SEO, perhaps it is because you have additional inward links (the Google Ad), or perhaps it is because the Google Search Algorithm pays attention to good customers (certainly not something admitted to by Google). There are also damping factors, sand-boxes, black-lists and other items that are beyond the scope of this post.
Improving Your Search
The upshot of all of this is that if you value search and want additional visitors to your web page(s), you need to boost your PageRank. Ask related sites with good PageRank to link to you. Use link repositories like Digg, Delicious and DMOZ. Start using social media to promote your site: not only will it drive traffic from these sources and allow you to build an online network, it also creates inward links that you control, providing some additional PageRank. Perhaps most importantly, build high-quality sites and content that people want to link to, that encourage traffic, referral and social discussion, and that do not prevent access by crawlers such as the Google bot. See the steps outlined in Ten Things That Matter For Search as well, and address them for your most important web pages.
Location, Location, Location on the web means search and PageRank. There is no point building a public site with no public!