Google PageRank – An empirical analysis

PageRank is an algorithm used by Google to assign an importance score to each page on the World Wide Web. The order of search results depends in part on the page rank assigned to each page. A web site has multiple pages, and a page rank is assigned to each page individually. The page rank of a page depends on the number of links coming in to the page, on the page rank of the pages those links come from, and on the number of outgoing links on those linking pages, since a linking page divides its rank among all the pages it links to.
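The dependence described above can be written as one formula. This is the commonly published form of PageRank, not something derived in this post. The symbols are assumptions introduced here: d is a damping factor (0.85 is the value usually quoted), N is the total number of pages, B(p) is the set of pages linking to p, and L(q) is the number of outgoing links on page q.

```latex
PR(p) = \frac{1 - d}{N} + d \sum_{q \in B(p)} \frac{PR(q)}{L(q)}
```

Each linking page q passes on its own rank divided by its number of outgoing links, so a link from a highly ranked page with few links is worth more than a link from a low ranked page with many links.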

Page rank is used to decide the order of the results for a Google search. Companies want their pages to be at the top of the search results. The easiest way to get there is to pay Google for AdWords: sponsored web sites appear above and to the right of the regular search results.

Not all companies can afford to pay Google for AdWords. There is a cost-effective alternative, which requires an understanding of how Google works. Knowing Google's internal workings helps to define analytical (or mathematical) strategies that can be implemented in a web site to improve its page rank and, eventually, bring more visitors to the site.

To validate the theory with an empirical analysis, I drafted a plan with the following steps.

  1. Find a key word that is not in Google's index
  2. Create a graph (random graph) and find the initial transition probabilities
  3. Evaluate the steady state of the transition probability matrix using the power method, which gives the page rank of the graph (I will post the technical/mathematical details in a PDF. It is time-consuming to write matrices and other math notation in the WordPress editor)
  4. Develop a set of web sites (pages) linked according to the random graph, with each web site containing the new key word
  5. Allow the Google crawler to add the new sites to its index for the new key word
  6. Search for the new key word and observe the order of the search results
  7. Report the results
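Steps 2 and 3 can be sketched in a few lines of Python. This is only a minimal illustration, not the calculation that will appear in the PDF. The 4-page graph is hypothetical (only the links from node 1 to pages 2 and 4 come from this post), the function names are made up for the example, and the damping factor of 0.85 and the uniform treatment of pages without outgoing links are assumptions that keep the iteration well behaved.

```python
def transition_matrix(links, n):
    """Build a row-stochastic matrix: P[i][j] is the probability of
    following a link from page i to page j."""
    P = [[0.0] * n for _ in range(n)]
    for src in range(n):
        targets = links.get(src, [])
        if not targets:
            # Assumption: a page with no outgoing links jumps to any page uniformly.
            P[src] = [1.0 / n] * n
        else:
            for dst in targets:
                P[src][dst] += 1.0 / len(targets)
    return P


def power_method(P, damping=0.85, tol=1e-12, max_iter=1000):
    """Iterate rank <- rank * G until it stops changing, where
    G = damping * P + (1 - damping) / n is the damped transition matrix."""
    n = len(P)
    rank = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new_rank = [
            (1.0 - damping) / n
            + damping * sum(rank[i] * P[i][j] for i in range(n))
            for j in range(n)
        ]
        if sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical graph, 0-indexed: index 0 is node 1, which links to pages 2 and 4.
# The remaining links are invented for the example.
links = {0: [1, 3], 1: [2], 2: [0], 3: [2]}
P = transition_matrix(links, 4)
ranks = power_method(P)

for page, score in enumerate(ranks, start=1):
    print(f"page {page}: {score:.4f}")
```

The resulting vector sums to 1, and the page with the most incoming rank (page 3 in this toy graph) ends up with the highest score.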

1. A new key word was selected and is given below. There are no Google search results for the key word.

2. The random graph is a representation of how the web sites are linked to each other. It will be implemented with a set of blog posts, each of which contains the key word and links out as the graph dictates. Each node of the graph denotes a blog post. The edges connecting the nodes are the hyperlinks between the posts.
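One simple way to produce such a graph is to decide each possible link independently with a fixed probability. The sketch below assumes that model; the post does not say how the actual graph was generated, and the function name, the number of pages, the link probability and the seed are all hypothetical.

```python
import random


def random_link_graph(num_pages, link_probability, seed=None):
    """Return a dict mapping each page number (starting at 1) to the list of
    pages it links out to. Each possible link is included independently."""
    rng = random.Random(seed)  # a fixed seed makes the graph reproducible
    graph = {}
    for src in range(1, num_pages + 1):
        graph[src] = [
            dst
            for dst in range(1, num_pages + 1)
            if dst != src and rng.random() < link_probability  # no self-links
        ]
    return graph


graph = random_link_graph(num_pages=5, link_probability=0.4, seed=1)
for page, targets in graph.items():
    print(f"page {page} links out to: {targets}")
```

Each entry of the dictionary then tells which hyperlinks to place in the corresponding blog post.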

Note: This page is used for the empirical analysis of Google's page rank. The links will be created based on the random graph. This is node #1, which has the key words: xysivabodzinyx , xysivabodzinxy . As per the graph, it links out to page 2 and page 4.
