Fri . 19 Feb 2019

Google Data Centers

Google data centers are the computer software and large hardware resources Google uses to provide its services. This article describes the technological infrastructure behind Google's websites, as presented in the company's public announcements.

Contents

  • 1 Locations
  • 2 Hardware
    • 2.1 Original hardware
    • 2.2 Production hardware
    • 2.3 Network topology
    • 2.4 Project 02
    • 2.5 Summa papermill
    • 2.6 Modular container data centers
  • 3 Software
    • 3.1 Software development practices
  • 4 Search infrastructure
    • 4.1 Index
    • 4.2 Server types
  • 5 References
  • 6 Further reading
  • 7 External links

Locations

The locations of Google's various data centers are as follows:

United States:

  1. Berkeley County, South Carolina: since 2007, expanded in 2013, 150 employment positions
  2. Council Bluffs, Iowa (41°13′17.7″N 95°51′49.92″W; 41.221583°N 95.8638667°W): announced 2007, first phase completed 2009, expanded 2013 and 2014, 130 employment positions
  3. Douglas County, Georgia (33°44′59.04″N 84°35′5.33″W; 33.7497333°N 84.5848139°W): since 2003, 350 employment positions
  4. Jackson County, Alabama
  5. Lenoir, North Carolina (35°53′54.78″N 81°32′50.58″W; 35.8985500°N 81.5473833°W): announced 2007, completed 2009, over 110 employment positions
  6. Mayes County, Oklahoma
  7. Montgomery County, Tennessee
  8. Pryor Creek, Oklahoma, at MidAmerica Industrial Park (36°14′28.1″N 95°19′48.22″W; 36.241139°N 95.3300611°W): announced 2007, expanded 2012, 100 employment positions
  9. The Dalles, Oregon (45°37′57.04″N 121°12′8.16″W; 45.6325111°N 121.2022667°W): since 2006, 80 full-time employment positions

South America:

  • Quilicura, Chile: announced 2012, online since 2015, up to 20 employment positions expected

Europe:

  • Saint-Ghislain, Belgium: announced 2007, completed 2010, no job information available
  • Hamina, Finland (60°32′11.68″N 27°7′1.21″E; 60.5365778°N 27.1170028°E): announced 2009, first phase completed 2011, expanded 2012, no job information available
  • Dublin, Ireland (53°19′12.39″N 6°26′31.43″W; 53.3201083°N 6.4420639°W): announced 2011, completed 2012, no job information available
  • Eemshaven, Netherlands (53°27′03″N 6°49′54″E; 53.450939°N 6.831570°E): announced 2014, completed 2016, 200 employment positions

Asia:

  • Jurong West, Singapore: announced 2011, completed 2013, no job information available
  • Changhua County, Taiwan: announced 2011, completed 2013, 60 employment positions

Hardware

Original hardware

The original hardware circa 1998 that was used by Google when it was located at Stanford University included:

  • Sun Microsystems Ultra II with dual 200 MHz processors and 256 MB of RAM. This was the main machine for the original Backrub system.
  • 2 × 300 MHz dual Pentium II servers donated by Intel; they included 512 MB of RAM and 10 × 9 GB hard drives between the two. It was on these that the main search ran.
  • F50 IBM RS/6000 donated by IBM, which included 4 processors, 512 MB of memory and 8 × 9 GB hard disk drives.
  • Two additional boxes, which included 3 × 9 GB hard drives and 6 × 4 GB hard disk drives respectively (the original storage for Backrub). These were attached to the Sun Ultra II.
  • SDD disk expansion box with another 8 × 9 GB hard disk drives donated by IBM
  • Homemade disk box which contained 10 × 9 GB SCSI hard disk drives

Production hardware

Google uses commodity-class x86 server computers running customized versions of Linux. The goal is to purchase CPU generations that offer the best performance per dollar, not absolute performance. How this is measured is unclear, but it is likely to incorporate the running costs of the entire server, and CPU power consumption could be a significant factor. Servers as of 2009–2010 consisted of custom-made open-top systems containing two processors (each with several cores), a considerable amount of RAM spread over 8 DIMM slots housing double-height DIMMs, and at least two SATA hard disk drives connected through a non-standard ATX-sized power supply unit. The servers were open-top so that more servers could fit into a rack. According to CNET and to a book by John Hennessy, each server had a novel 12-volt battery to reduce costs and improve power efficiency.

According to Google, the electrical power drawn by its global data center operations ranges between 500 and 681 megawatts. The combined processing power of these servers might have reached from 20 to 100 petaflops in 2008.

Network topology

Details of Google's worldwide private networks are not publicly available, but Google publications make references to the "Atlas Top 10" report, which ranks Google as the third largest ISP behind Level 3.

In order to run such a large network, with direct connections to as many ISPs as possible at the lowest possible cost, Google has a very open peering policy.

Public peering data show that the Google network can be accessed from 67 public exchange points and 69 different locations across the world. As of May 2012, Google had 882 Gbit/s of public connectivity, not counting the private peering agreements that Google has with the largest ISPs. This public network is used to distribute content to Google users as well as to crawl the Internet to build its search indexes.

The private side of the network is a secret, but recent disclosures from Google indicate that it uses custom-built high-radix switch-routers with a capacity of 128 × 10 Gigabit Ethernet ports for the wide area network. Assuming no fewer than two routers per datacenter for redundancy, we can conclude that the Google network scales in the terabit-per-second range: with two fully loaded routers, the bisection bandwidth amounts to 1,280 Gbit/s.
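The 1,280 Gbit/s figure follows directly from the port count and port speed quoted above. A minimal sketch of that arithmetic in Python; the function name is illustrative and not taken from any Google publication:

```python
def router_capacity_gbits(ports: int, port_speed_gbits: int) -> int:
    """Aggregate capacity of one switch-router, in Gbit/s."""
    return ports * port_speed_gbits


# 128 ports of 10 Gigabit Ethernet, as quoted in the text.
per_router = router_capacity_gbits(ports=128, port_speed_gbits=10)
print(per_router)  # 1280
```

With two such routers facing each other, the traffic crossing between them is bounded by the capacity of one router, which is consistent with the terabit-per-second estimate.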

These custom switch-routers are connected to DWDM devices to interconnect data centers and points of presence (PoPs) via dark fibre.

From a datacenter view, the network starts at the rack level, where 19-inch racks are custom-made and contain 40 to 80 servers (20 to 40 1U servers on either side), while new servers are 2U rackmount systems. Each rack has a switch. Servers are connected via a 1 Gbit/s Ethernet link to the top-of-rack (TOR) switch. TOR switches are then connected to a gigabit cluster switch using multiple gigabit or ten-gigabit uplinks. The cluster switches themselves are interconnected and form the datacenter interconnect fabric, most likely using a dragonfly design rather than a classic butterfly or flattened butterfly layout.

From an operations standpoint, when a client computer attempts to connect to Google, several DNS servers resolve www.google.com into multiple IP addresses via a round-robin policy. This acts as the first level of load balancing and directs the client to different Google clusters. A Google cluster has thousands of servers, and once the client has connected to the server, additional load balancing is done to send the queries to the least loaded web server. This makes Google one of the largest and most complex content delivery networks.
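The two levels of load balancing described above can be illustrated with a small Python sketch. It is only a toy model under stated assumptions: the class and function names are hypothetical, the addresses come from the reserved documentation range 192.0.2.0/24, and the load numbers are made up for illustration.

```python
from itertools import cycle


class RoundRobinResolver:
    """First level: hand out cluster addresses in rotation, like round-robin DNS."""

    def __init__(self, addresses):
        self._cycle = cycle(addresses)

    def resolve(self) -> str:
        return next(self._cycle)


def pick_least_loaded(loads: dict) -> str:
    """Second level: inside a cluster, choose the web server with the lowest load."""
    return min(loads, key=loads.get)


# Hypothetical cluster addresses (documentation range, not real Google IPs).
resolver = RoundRobinResolver(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
picks = [resolver.resolve() for _ in range(4)]

# Hypothetical per-server load inside the chosen cluster.
cluster_loads = {"web-a": 12, "web-b": 3, "web-c": 7}
chosen = pick_least_loaded(cluster_loads)

print(picks)
print(chosen)
```

Successive lookups rotate through the cluster addresses, and the second step then selects the server currently reporting the lowest load.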

Google has numerous data centers scattered around the world. At least 12 significant Google data center installations are located in the United States. The largest known centers are located in The Dalles, Oregon; Atlanta, Georgia; Reston, Virginia; Lenoir, North Carolina; and Moncks Corner, South Carolina. In Europe, the largest known centers are in Eemshaven and Groningen in the Netherlands and Mons, Belgium. Google's Oceania Data Center is claimed to be located in Sydney, Australia.

Project 02

One of the largest Google data centers is located in the town of The Dalles, Oregon, on the Columbia River, approximately 80 miles from Portland. Codenamed "Project 02", the $600 million complex was built in 2006 and is approximately the size of two American football fields, with cooling towers four stories high. The site was chosen to take advantage of inexpensive hydroelectric power, and to tap into the region's large surplus of fiber optic cable, a remnant of the dot-com boom. A blueprint of the site appeared in 2008.

Summa papermill

In February 2009, Stora Enso announced that it had sold the Summa paper mill in Hamina, Finland to Google for 40 million euros. Google plans to invest 200 million euros in the site to build a data center. Google chose this location due to the availability and proximity of renewable energy sources.

Modular container data centers

In 2005, Google was researching a containerized modular data center. Google filed a patent application for this technology in 2003.

Software

Most of the software stack that Google uses on its servers was developed in-house. According to a well-known Google employee, C++, Java, Python and (more recently) Go are favored over other programming languages. For example, the back end of Gmail is written in Java and the back end of Google Search is written in C++. Google has acknowledged that Python has played an important role from the beginning, and that it continues to do so as the system grows and evolves.

The software that runs the Google infrastructure includes:

  • Google Web Server (GWS) – custom Linux-based Web server that Google uses for its online services
  • Storage systems:
    • Google File System and its successor, Colossus
    • BigTable – structured storage built upon GFS/Colossus
    • Spanner – planet-scale structured storage system, next generation of BigTable stack
    • Google F1 – a distributed, quasi-SQL DBMS based on Spanner, replacing a custom version of MySQL
  • Chubby lock service
  • MapReduce and Sawzall programming language
  • Indexing/search systems:
    • TeraGoogle – Google's large search index launched in early 2006, designed by Anna Patterson of Cuil fame
    • Caffeine (Percolator) – continuous indexing system launched in 2010
    • Hummingbird – major search index update, including complex search and voice search
  • Borg – declarative process scheduling software

Google has developed several abstractions which it uses for storing most of its data:

  • Protocol Buffers – "Google's lingua franca for data", a binary serialization format which is widely used within the company
  • SSTable (Sorted Strings Table) – a persistent, ordered, immutable map from keys to values, where both keys and values are arbitrary byte strings. It is also used as one of the building blocks of BigTable
  • RecordIO – a sequence of variable sized records
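To make the SSTable description above concrete, here is a minimal in-memory sketch in Python. It models only the "ordered, immutable map from byte-string keys to byte-string values" part; persistence and the real on-disk file format are not modeled, and the class and method names are hypothetical rather than Google's API.

```python
import bisect


class SSTable:
    """In-memory stand-in for a sorted, immutable key-value table."""

    def __init__(self, items):
        # Sort once at construction time; tuples keep the table read-only.
        pairs = sorted(dict(items).items())
        self._keys = tuple(k for k, _ in pairs)
        self._values = tuple(v for _, v in pairs)

    def get(self, key: bytes, default=None):
        """Point lookup by binary search over the sorted keys."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return default

    def scan(self, start: bytes, end: bytes):
        """Return all (key, value) pairs with start <= key < end, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


table = SSTable({b"cherry": b"3", b"apple": b"1", b"banana": b"2"})
print(table.get(b"banana"))
print(table.scan(b"apple", b"cherry"))
```

Because the keys are kept sorted and never change after construction, both point lookups and range scans reduce to binary searches, which is the property that makes such a table a convenient building block for larger storage systems.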

Software development practices

Most operations are read-only. When an update is required, queries are redirected to other servers, so as to simplify consistency issues. Queries are divided into sub-queries, and those sub-queries may be sent to different ducts in parallel, thus reducing latency.
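The idea of splitting a query into sub-queries that run in parallel can be sketched as a simple scatter-gather in Python. This is an illustration under assumptions, not Google's implementation: the partitions are plain lists of strings, the matching rule is a whole-word test, and the function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, term):
    """Sub-query: return the documents in one partition that contain the term."""
    return [doc for doc in partition if term in doc.split()]


def parallel_search(partitions, term):
    """Send one sub-query per partition in parallel, then merge the answers."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        results = list(pool.map(lambda p: search_partition(p, term), partitions))
    return sorted(hit for part in results for hit in part)


# Hypothetical data split across two partitions.
partitions = [
    ["google data center", "paper mill"],
    ["data index", "search engine"],
]
hits = parallel_search(partitions, "data")
print(hits)
```

The overall latency is then governed by the slowest sub-query rather than by the sum of all of them.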

To lessen the effects of unavoidable hardware failure, software is designed to be fault tolerant. Thus, when a system goes down, data is still available on other servers, which increases reliability.

Search infrastructure

Index

Like most search engines, Google indexes documents by building a data structure known as an inverted index. Such an index returns the list of documents that contain a given query word. The index is very large due to the number of documents stored in the servers.
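A toy inverted index shows the basic idea. The following Python sketch assumes whitespace tokenization and lowercase matching, which are simplifications chosen for illustration; the function name and sample documents are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(documents: dict) -> dict:
    """Map each word to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


# Hypothetical document collection keyed by document ID.
docs = {
    1: "Google data centers",
    2: "data center cooling",
    3: "paper mill in Hamina",
}
inverted = build_inverted_index(docs)
print(sorted(inverted["data"]))
```

Looking up a query word is then a single dictionary access, regardless of how many documents were indexed.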

The index is partitioned by document IDs into many pieces called shards. Each shard is replicated onto multiple servers. Initially, the index was served from hard disk drives, as is done in traditional information retrieval (IR) systems. Google dealt with the increasing query volume by increasing the number of replicas of each shard and thus the number of servers. Soon they found that they had enough servers to keep a copy of the whole index in main memory (although with low replication or no replication at all), and in early 2001 Google switched to an in-memory index system. This switch "radically changed many design parameters" of their search system, and allowed for a significant increase in throughput and a large decrease in latency of queries.
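Partitioning by document ID and replicating each shard can be sketched in Python as follows. The text does not say how document IDs are assigned to shards, so the modulo placement here is an assumption, as are all class and function names; replicas are modeled as in-memory dictionaries.

```python
import random


def shard_for(doc_id: int, num_shards: int) -> int:
    """Map a document ID to a shard (modulo placement is an assumption)."""
    return doc_id % num_shards


class ShardedIndex:
    def __init__(self, num_shards: int, replicas_per_shard: int):
        # shards[s][r] is replica r of shard s, mapping word -> set of doc IDs.
        self.shards = [
            [{} for _ in range(replicas_per_shard)] for _ in range(num_shards)
        ]

    def add(self, doc_id: int, text: str) -> None:
        """Index a document on every replica of the shard that owns it."""
        replicas = self.shards[shard_for(doc_id, len(self.shards))]
        for replica in replicas:
            for word in text.lower().split():
                replica.setdefault(word, set()).add(doc_id)

    def lookup(self, word: str) -> list:
        """Ask one replica of every shard and merge the partial results."""
        hits = set()
        for replicas in self.shards:
            replica = random.choice(replicas)  # any replica can answer
            hits |= replica.get(word.lower(), set())
        return sorted(hits)


sharded = ShardedIndex(num_shards=3, replicas_per_shard=2)
sharded.add(1, "Google data centers")
sharded.add(2, "data center cooling")
sharded.add(5, "paper mill")
print(sharded.lookup("data"))
```

Since every replica of a shard holds the same data, adding replicas raises the number of queries a shard can serve without changing the answers.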

In June 2010, Google rolled out a next-generation indexing and serving system called "Caffeine", which can continuously crawl and update the search index. Previously, Google updated its search index in batches using a series of MapReduce jobs. The index was separated into several layers, some of which were updated faster than the others, and the main layer would not be updated for as long as two weeks. With Caffeine, the entire index is updated incrementally on a continuous basis. Later, Google revealed a distributed data processing system called "Percolator", which is said to be the basis of the Caffeine indexing system.

Server types

Google's server infrastructure is divided into several types, each assigned to a different purpose:

  • Web servers coordinate the execution of queries sent by users, then format the result into an HTML page. The execution consists of sending queries to index servers, merging the results, computing their rank, retrieving a summary for each hit using the document server, asking for suggestions from the spelling servers, and finally getting a list of advertisements from the ad server.
  • Data-gathering servers are permanently dedicated to spidering the Web. Google's web crawler is known as GoogleBot. They update the index and document databases and apply Google's algorithms to assign ranks to pages.
  • Each index server contains a set of index shards. They return a list of document IDs ("docids"), such that the documents corresponding to each docid contain the query word. These servers need less disk space, but suffer the greatest CPU workload.
  • Document servers store documents. Each document is stored on dozens of document servers. When performing a search, a document server returns a summary for the document based on query words. They can also fetch the complete document when asked. These servers need more disk space.
  • Ad servers manage advertisements offered by services like AdWords and AdSense
  • Spelling servers make suggestions about the spelling of queries
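The way a web server coordinates the other server types can be sketched as a single function in Python. This is a hypothetical outline, not Google's code: each backend is represented by a plain callable, ranking is replaced by a placeholder sort on docid, and the result is returned as a dictionary instead of a formatted HTML page.

```python
def handle_query(query, index_servers, document_server, spelling_server, ad_server):
    """Coordinate one search, following the steps listed for web servers."""
    # 1. Send the query to every index server and merge the returned docids.
    doc_ids = set()
    for index_server in index_servers:
        doc_ids |= index_server(query)
    # 2. Rank the hits (placeholder: ascending docid, not a real ranking).
    ranked = sorted(doc_ids)
    # 3. Fetch a summary per hit, then ask for a spelling suggestion and ads.
    return {
        "results": [(d, document_server(d, query)) for d in ranked],
        "suggestion": spelling_server(query),
        "ads": ad_server(query),
    }


# Stub backends standing in for the real server types (all hypothetical).
index_servers = [lambda q: {1, 4}, lambda q: {2}]
document_server = lambda d, q: f"summary of doc {d} for '{q}'"
spelling_server = lambda q: None
ad_server = lambda q: ["ad-1"]

page = handle_query("data", index_servers, document_server, spelling_server, ad_server)
print([doc_id for doc_id, _ in page["results"]])
```

Keeping each backend behind its own call mirrors the separation of roles in the list above, where index servers are CPU-bound and document servers are disk-bound.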

Further reading

  • L. A. Barroso; J. Dean; U. Hölzle (March–April 2003). "Web search for a planet: The Google cluster architecture" (PDF). IEEE Micro 23 (2): 22–28. doi:10.1109/MM.2003.1196112.
  • Shankland, Stephen, CNET news. "Google uncloaks once-secret server". April 1, 2009.

External links

  • Google Research Publications
  • Web Search for a Planet: The Google Cluster Architecture. Luiz André Barroso, Jeffrey Dean, Urs Hölzle
  • Underneath the Covers at Google: Current Systems and Future Directions. Talk given by Jeff Dean at the Google I/O conference in May 2008
There are excerpts from wikipedia on this article and video