
The World’s Most Voracious Readers

As of January 2014, the indexed portion of the Internet consisted of more than 2.04 billion pages. As of 2012, Google — the world’s largest index of the Internet — had indexed approximately 50 billion pages (about 5 million terabytes of data stored on more than 900,000 servers). If each web page were a sheet of paper, a printed version of Google’s index would stand more than 3,156 feet high, slightly taller than El Capitan, the iconic granite monolith in Yosemite National Park in California — now that’s what you call a truly monumental book. So who reads all these websites? Who are the Internet’s most voracious readers?

Most people will be surprised to learn that humans do not make up the majority of the Internet’s readers. A study published in December 2013 by Incapsula, an Internet services company, found that only 38.5% of Internet traffic was generated by humans, while 61.5% was generated by non-humans: search engines, bots (or web crawlers), scrapers, hacking tools, spammers, and other impersonators. Perhaps the most disturbing discovery was that 60% of the non-human Internet use was malicious. A year earlier, in 2012, human traffic was 49% and bot traffic was 51%. Sadly, bots are the Internet’s most voracious (and malicious) readers, like a horde of vandals, armed with spray paint, scissors, and butane lighters, released into the Library of Congress to wreak havoc. What will survive?

Below is the breakdown of human and bot traffic on the Internet:
38.5% Human traffic
61.5% Non-human traffic:
    31% Search engines
    5% Scrapers
    4.5% Hacking tools
    0.5% Spammers
    20.5% Other impersonators

Read related posts: Types of Book Readers
How Many Books Does the Average American Read?
Most Expensive Books Ever Sold
How Many People Read Books?

For further reading: incapsula.com/blog/bot-traffic-report-2013.html
statisticbrain.com/total-number-of-pages-indexed-by-google
wisegeek.org/how-big-is-the-internet.htm


How a Google Search Works

Each day, Google’s one million servers, located all over the world, process over one billion search requests and deliver more than 7.2 billion page views. The searches come from Internet users in more than 180 countries, with queries written in almost 150 different languages; 15% of the queries are entirely new. So what happens when you do a Google search?

1. After a web user types in a question or keywords and hits enter, the query travels in packets (each packet, containing a header and a footer, can carry up to 1,500 bytes) amazingly fast — at almost the speed of light (186,000 miles per second) — to the Google web servers. (Note how sluggish the human body is by comparison: nerve impulses travel at speeds of only 2 to 200 miles per hour.)
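The figures above lend themselves to a quick back-of-the-envelope calculation. The sketch below is purely illustrative — the 4,000-byte payload and 3,000-mile distance are invented examples, and real networks add routing and processing delays on top of light-speed transit:

```python
# Illustrative sketch only: how many 1,500-byte packets a payload needs,
# and the ideal light-speed transit time (ignoring all routing delays).
PACKET_SIZE = 1500           # bytes per packet, per the figure above
SPEED_OF_LIGHT = 186_000     # miles per second

def packet_count(payload_bytes):
    """Number of packets needed to carry a payload (ceiling division)."""
    return -(-payload_bytes // PACKET_SIZE)

def travel_time_ms(miles):
    """Ideal one-way travel time in milliseconds at the speed of light."""
    return miles / SPEED_OF_LIGHT * 1000

print(packet_count(4000))     # 3 packets for a 4,000-byte payload
print(travel_time_ms(3000))   # roughly 16 ms to cross 3,000 miles
```

For comparison, a nerve impulse traveling at 200 mph would need about 15 hours to cover those same 3,000 miles.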

2. The Google web servers form a network of more than 3 million computers, located in highly protected data centers and linked together to act as “one brain.”

3. The query is then forwarded to 1 million index servers. When a user does a Google search, he or she is not really searching the entire Internet, but rather Google’s extensive index. The Google index is created by Internet bots, known as web crawlers or web spiders, that browse the entire Internet, visiting each website and following each of its links. As of 2012, Google had indexed more than 50 billion web pages.
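The index those crawlers build can be pictured as an “inverted index”: a map from each word to the set of pages that contain it, so a query is answered by intersecting small sets rather than scanning the whole web. Here is a toy sketch — the pages and their text are invented examples, and Google’s real index is vastly larger and more sophisticated:

```python
# Illustrative toy inverted index, not Google's actual data structure.
pages = {
    "page1": "yosemite granite monolith el capitan",
    "page2": "granite kitchen counters",
    "page3": "el capitan yosemite climbing",
}

# Build the index: each word maps to the set of pages containing it.
index = {}
for url, text in pages.items():
    for word in text.split():
        index.setdefault(word, set()).add(url)

def search(query):
    """Return the pages that contain every word of the query."""
    results = [index.get(word, set()) for word in query.split()]
    return set.intersection(*results) if results else set()

print(sorted(search("el capitan")))   # ['page1', 'page3']
```

Answering a query this way touches only the sets for the query’s words, which is why an index lookup can be so much faster than rescanning every page.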

4. The query then travels to Google’s document servers, which retrieve the stored relevant web pages. A proprietary ranking algorithm, known as PageRank (named after co-founder Larry Page), then evaluates all the data (processing more than 50 million variables and more than 2 billion terms) to arrive at the most relevant pages. In simpler terms, PageRank counts how many outside links point to a particular web page and weighs how important those linking pages are, in order to display the most visited and best-known web pages first.
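The link-counting idea behind PageRank can be sketched as a “random surfer” simulation: a page’s score is the probability that a surfer who follows links (and occasionally jumps to a random page) ends up there. The code below is an illustrative simplification of the published PageRank formula, not Google’s production algorithm; the damping factor of 0.85 comes from the original paper, and the three-page link graph is an invented example:

```python
# Hedged sketch of the PageRank idea via power iteration.
def pagerank(links, d=0.85, iterations=50):
    """links: dict mapping each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}      # start with equal scores
    for _ in range(iterations):
        # (1 - d) is the chance of jumping to a random page.
        new = {p: (1 - d) / n for p in pages}
        for p, outgoing in links.items():
            if outgoing:
                # A page shares its current score among its outgoing links.
                share = d * rank[p] / len(outgoing)
                for q in outgoing:
                    new[q] += share
            else:
                # Dangling page: spread its score over every page.
                for q in pages:
                    new[q] += d * rank[p] / n
        rank = new
    return rank

ranks = pagerank({"A": ["C"], "B": ["C"], "C": ["A"]})
```

Here page C, linked to by both A and B, earns the highest score — the “important links” intuition described above.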

5. The Google servers then generate snippets, or short abstracts, that describe or summarize each of the relevant web pages.

6. The information is then delivered to the internet user’s computer.

Elapsed time for all this evaluation and these millions of calculations: an astounding half a second!

Related article: What is the most googled word?

For further reading: The Human Face of Big Data by Rick Smolan and Jennifer Erwitt, Against All Odds Productions (2012)
http://computer.howstuffworks.com/ip-convergence2.htm
http://www.statisticbrain.com/google-searches/
http://hypertextbook.com/facts/2002/DavidParizh.shtml

