Appendix C -- What Google's PageRanks Mean

1. Google's Official Description

Google has developed algorithms that automatically calculate the "importance" of any given Web page using the number of other pages on the Web that link to the page and the "importance" of those other pages. While the operational details of these calculations are a closely guarded trade secret, Google has provided a general description of its methodology on its Website at this URL:

http://www.google.com/technology/

The reader is strongly advised to review Google's official, albeit brief, description. The remaining sections of this appendix illustrate how the general ideas described in Google's note might be implemented in highly simplified circumstances. Needless to say, it takes a mighty blast of creativity to leap from the trivial example described in the following paragraphs to the formulation and implementation of a set of algorithms powerful enough and efficient enough to make these calculations automatically for all of the billions of pages on the Web! ... :-)


2. An Illustrative Example

Imagine a well-attended conference whose main session features ten prominent panelists who are committed to providing cogent answers to questions posed about important issues by the conference attendees.

Furthermore, imagine that one of the panelists is the President of the United States, that eight other panelists are CEOs of Fortune 500 companies, and that the tenth panelist, Mr. Hector X, is apparently known only to the President.

In order to assure that conference attendees receive the most authoritative answers, the panel's efficient moderator, Ms. Mini Bot, asks each of the panelists to point out the two most important members of the group. She also informs them that they should feel free to point to themselves.

  • As Ms. Bot looks around the table, she is not surprised to see that all of the panelists are pointing to themselves with one hand, and that seven of them are pointing to the President with their other hand. She is somewhat surprised to see that two of the panelists are pointing to Mr. Steve J. However, she is astounded to see that the President is pointing to the mysterious Mr. Hector X.

  • The votes received by each panelist, including the vote each cast for himself, appear in the second column of Table X (below). The table shows that the President received 8 votes, Mr. J received 3 votes, Mr. X received 2 votes, and everyone else received only one vote.

  • Ms. Bot then asks each panelist to identify two topics they would be prepared to address. These topics appear in the last column of Table X. The table shows that the top three vote-getters are prepared to discuss world trade, disasters, and iPods.

  • Although she has no idea who Mr. X is, Ms. Bot is astute enough to realize that the President's vote should be counted far more heavily than the single vote of any other panelist. The President cast two votes, so Ms. Bot decides to divide the President's own count of 8 votes in half and award 4 votes to Mr. X in place of the single vote originally counted for him. Added to Mr. X's vote for himself, this gives him an adjusted total of 1 + 4 = 5 votes, as shown in the third column of the table.

As a result of this adjustment, the panelists are now sorted into four ranks: the President holds the highest rank (4) with 8 votes; Mr. X holds the next rank (3) with 5 votes; and Mr. J holds the next rank (2) with 3 votes. Everyone else received only one vote, so they are all relegated to the lowest rank (1).

Table X. Votes for Panelists

Name                    Votes Received  Adjusted Votes  Rank  Preferred Topics
----------------------  --------------  --------------  ----  ----------------------
(President) George W.   8               8               4     World trade, Disasters
Hector X.               2               1 + 4 = 5       3     Disasters, iPods
Steve J.                3               3               2     World trade, iPods
Others (7)              1 each          1 each          1     misc.
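
The tally and adjustment behind Table X can be sketched in a few lines of Python. This is only an illustration of Ms. Bot's arithmetic, not Google's algorithm. The names CEO-1 through CEO-7 are hypothetical labels for the seven unnamed panelists, and the choice of which panelists point to Mr. J is an assumption made so that the totals reproduce the table.

```python
from collections import Counter

# Hypothetical labels for the seven unnamed panelists.
OTHERS = [f"CEO-{i}" for i in range(1, 8)]

# Each panelist points to themselves and to one other person.
# ASSUMPTION: the text does not say which panelists point to Steve J.;
# two are picked arbitrarily so that the totals match Table X.
POINTS_TO = {"President": "Hector X", "Hector X": "President", "Steve J": "President"}
POINTS_TO.update({name: "Steve J" for name in OTHERS[:2]})
POINTS_TO.update({name: "President" for name in OTHERS[2:]})

VOTES_CAST = 2  # every panelist casts exactly two votes


def raw_votes(points_to):
    """Count one self-vote and one other-hand vote per panelist."""
    votes = Counter()
    for voter, other in points_to.items():
        votes[voter] += 1
        votes[other] += 1
    return votes


def adjusted_votes(points_to, heavy_voter="President"):
    """Reweight only the heavy voter's other-hand vote, as Ms. Bot does."""
    raw = raw_votes(points_to)
    adjusted = Counter(raw)
    share = raw[heavy_voter] // VOTES_CAST  # the President's 8 votes split in half
    target = points_to[heavy_voter]
    adjusted[target] += share - 1           # the single vote is replaced by the share
    return adjusted


def ranks(totals):
    """Map each distinct vote total to a rank, with 1 as the lowest."""
    levels = sorted(set(totals.values()))
    return {name: levels.index(total) + 1 for name, total in totals.items()}


raw = raw_votes(POINTS_TO)
adjusted = adjusted_votes(POINTS_TO)
rank = ranks(adjusted)
```

Under these assumptions, `raw`, `adjusted`, and `rank` hold the three numeric columns of Table X. Ms. Bot reweights only the President's vote; a fuller scheme would presumably reweight every panelist's vote in the same way and repeat the process until the totals settle.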

Ms. Bot now applies a Google-type strategy to determine who should answer each query and in what order. Her decisions are recorded in Table Y (below).

  • World Trade ==> The President (rank = 4) will speak first. Mr. X (rank = 3) is not prepared to discuss this topic, so Mr. J (rank = 2) will speak second.

  • Disasters ==> The President (rank = 4) will speak first. Mr. X (rank = 3) will speak second.

  • iPods ==> The President (rank = 4) is not prepared to discuss this topic, so the first speaker will be Mr. X (rank = 3). The second speaker on this topic will be Mr. J -- who will probably take very careful notes on whatever the mysterious Mr. X has to say ... :-)

Table Y. Order of Responses by Ranks to Queries from Audience

Topics       1st Speaker      2nd Speaker
-----------  ---------------  ---------------
World Trade  President (4)    Steve J. (2)
Disasters    President (4)    "Hecky-Boy" (3)
iPods        "Hecky-Boy" (3)  Steve J. (2)
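
Ms. Bot's selection rule (keep only the panelists prepared for the topic, then order them by rank) can be sketched as follows. The ranks and topics are copied from Table X; the function name `speakers_for` and the `limit` parameter are hypothetical, chosen only for this illustration.

```python
# Ranks and preferred topics copied from Table X (the seven "Others" are omitted).
PANELISTS = {
    "President": (4, {"world trade", "disasters"}),
    "Hector X": (3, {"disasters", "ipods"}),
    "Steve J": (2, {"world trade", "ipods"}),
}


def speakers_for(topic, panelists=PANELISTS, limit=2):
    """Return up to `limit` panelists prepared for `topic`, highest rank first."""
    wanted = topic.lower()  # topics are matched without regard to capitalization
    prepared = [
        (rank, name)
        for name, (rank, topics) in panelists.items()
        if wanted in topics
    ]
    prepared.sort(reverse=True)  # highest rank first
    return [name for _, name in prepared[:limit]]


schedule = {topic: speakers_for(topic) for topic in ("World Trade", "Disasters", "iPods")}
```

Here `schedule` maps each topic to the same pair of speakers shown in Table Y. The rule works in two steps: relevance to the query decides who is eligible to answer, and rank decides only the order among those who are eligible.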


3. Some Real PageRanks Assigned to Real Home Pages

Table Z (below) contains the same information about the Home Pages of some prominent organizations in the information technology sector, but it also contains numbers in parentheses that indicate the number of pages that link to each of the Home Pages. These numbers are returned by Google whenever it is given a "link" command.

  • For example, when given the command "link:www.google.com" on 4/15/06, Google returned 3,050,000 as the number of pages that linked to its own Home Page. It reported 164,000 pages that linked to the W3C's Home Page, and 76,600 pages that linked to Apple's Home Page.

  • The sites in each rank are ordered from left to right according to the number of links they received.

  • The reader should note that the maximum number of links in each rank is larger than the maximum number in all lower ranks; likewise the minimum number of links in each rank is larger than the minimum number in lower ranks. In other words, pages with more links tend to be found in the higher ranks.

  • However, there is considerable overlap in the ranges of links received by different ranks. For example, Apple is in rank 10 with 76,600 links, which is considerably lower than Yahoo's 463,000, yet Yahoo is in rank 9. Presumably, many of the sites that linked to Apple had higher PageRanks than those that linked to Yahoo.

  • The PageRank received by any Web page may vary from one month to the next, rising or falling. A page that is ranked 9 one month may be ranked 8 the next. However, the fact that PageRanks represent Google's distillation of the "votes" a page receives from all of the other Websites in the world makes it unlikely that a page would rise two or more ranks within a few weeks or fall by two or more ranks. In other words, the changes are more likely to be fluctuations rather than sudden surges or plummets -- unless the world learns that something "very good" or "very bad" has happened to the organization that owns the Website. ... :-)

    All of the link numbers that appear in Table Z (below) were obtained during mid-March 2006 from Google's toolbar in browsers on workstations in the DLL's offices in Silver Spring, Maryland.

Table Z. PageRanks of Some Information Technology Sector Home Pages -- With Link Counts

Silver Spring, Md -- 4/15/06

PageRank  Organizations
--------  -------------
10        Google (3,050,000), W3C (164,000), Apple (76,600)
9         Yahoo (463,000), MSN (252,000), Microsoft (154,000), Sun (95,700), IBM (67,000), Hewlett-Packard (45,700), AOL (34,100), Intel (31,900), Cisco (25,100), Oracle (17,300), Verisign (10,500), IETF (5,150)
8         Verizon (69,800), Linux (34,900), Dell (21,800), Symantec (14,900), Novell (11,500), Fujitsu (4,150), 3Com (3,280), Nortel (2,920), IANA (1,650), Blackboard (1,420)
7         Lenovo (22,300), Nintendo (5,740), Gateway (5,660), Toshiba (3,950), NCR (1,030), Desire2Learn (106)
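
The observations made about Table Z (that the maximum and minimum link counts both rise with rank, and that the ranges nevertheless overlap) can be checked mechanically. The sketch below copies the link counts from the table; the function names are hypothetical, and nothing here reflects how Google actually computes PageRank.

```python
# Link counts copied from Table Z, keyed by PageRank.
LINK_COUNTS = {
    10: {"Google": 3_050_000, "W3C": 164_000, "Apple": 76_600},
    9: {"Yahoo": 463_000, "MSN": 252_000, "Microsoft": 154_000, "Sun": 95_700,
        "IBM": 67_000, "Hewlett-Packard": 45_700, "AOL": 34_100, "Intel": 31_900,
        "Cisco": 25_100, "Oracle": 17_300, "Verisign": 10_500, "IETF": 5_150},
    8: {"Verizon": 69_800, "Linux": 34_900, "Dell": 21_800, "Symantec": 14_900,
        "Novell": 11_500, "Fujitsu": 4_150, "3Com": 3_280, "Nortel": 2_920,
        "IANA": 1_650, "Blackboard": 1_420},
    7: {"Lenovo": 22_300, "Nintendo": 5_740, "Gateway": 5_660, "Toshiba": 3_950,
        "NCR": 1_030, "Desire2Learn": 106},
}


def link_range(rank):
    """Return (minimum, maximum) link count among the sites in one rank."""
    counts = LINK_COUNTS[rank].values()
    return min(counts), max(counts)


def strictly_increasing(values):
    """True if every value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


def lower_ranked_with_more_links(site):
    """List sites in lower ranks that have more links than `site`."""
    rank = next(r for r, sites in LINK_COUNTS.items() if site in sites)
    links = LINK_COUNTS[rank][site]
    return [
        name
        for lower in sorted(LINK_COUNTS, reverse=True) if lower < rank
        for name, count in LINK_COUNTS[lower].items() if count > links
    ]


ranks_ascending = sorted(LINK_COUNTS)                      # 7, 8, 9, 10
minimums = [link_range(r)[0] for r in ranks_ascending]
maximums = [link_range(r)[1] for r in ranks_ascending]
trend_holds = strictly_increasing(minimums) and strictly_increasing(maximums)
apple_overlap = lower_ranked_with_more_links("Apple")
```

With the figures in the table, `trend_holds` is true, and `apple_overlap` lists the rank-9 sites (Yahoo among them) that received more links than Apple even though Apple sits in rank 10.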

Last updated: Monday 27-Mar-2006 12:18 PM