EE Times-India

Lessons learned from Google

Posted: 26 Dec 2008

Keywords: cloud computing, data centre, Internet, parallel programming

Randy Katz, a professor of computer science at the University of California, Berkeley, worked on a sabbatical at Google, Inc. in the summer of 2006. He was setting up a new lab to study the impact of the rise of large data centres, and he wanted to get a closer look at the one on the other side of San Francisco Bay. Katz returned to Berkeley with a bagful of observations about the future of computing, programming and design.


"I brought back not just insights into technological trends and programming skills that our students need to thrive in this new commercial environment, but also how we should organise our own activities for maximum collaboration and productivity, even in engineering research groups," he said.

For a computer scientist, the experience was like being a kid in a candy shop. The sweets included more than 100,000 networked computers running like a handful of super-sized machines.

"Researchers like me are lucky to have access to a few hundred or a thousand computers [but] here was Google two years ago organising computations across 100 times as many machines, and they have probably taken that to a factor of ten times more machines since 2006," Katz said.

"They can spread out the processing of things like Web search and advertising over multiple thousands of machines. A major building block they use is a data-intensive parallel programming paradigm called MapReduce. They have applied it very broadly across many of the things under the hood at Google."

Figure: Google's algorithm splits, or maps, a task into many jobs that run in parallel.

With MapReduce, Google technicians can take the data, partition it across a large number of machines and run algorithms on each piece simultaneously. Intermediate results are then transferred over the interconnect to a reduction step, which combines them. "This is like a giant sort/merge type of application," Katz explained.
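The partition/map/reduce flow Katz describes can be sketched in a few lines. The example below is a minimal, single-machine word-count illustration of the pattern, not Google's actual API: the input is partitioned, each partition is mapped in parallel, and the intermediate (word, count) pairs are merged in a reduce step. All function names here are illustrative assumptions.

```python
from collections import defaultdict
from multiprocessing import Pool

def map_chunk(chunk):
    """Map step: emit (word, 1) pairs for one partition of the input."""
    pairs = []
    for line in chunk:
        for word in line.split():
            pairs.append((word, 1))
    return pairs

def reduce_pairs(all_pairs):
    """Reduce step: merge intermediate pairs into final counts,
    like the sort/merge combination Katz describes."""
    counts = defaultdict(int)
    for pairs in all_pairs:
        for word, n in pairs:
            counts[word] += n
    return dict(counts)

def word_count(lines, workers=4):
    # Partition the input across workers, run the map step in
    # parallel, then combine results in the reduce step.
    chunks = [lines[i::workers] for i in range(workers)]
    with Pool(workers) as pool:
        intermediate = pool.map(map_chunk, chunks)
    return reduce_pairs(intermediate)

if __name__ == "__main__":
    print(word_count(["the quick brown fox", "the lazy dog"]))
```

In a real deployment the partitions live on thousands of machines and the intermediate pairs cross the data-centre interconnect; the shape of the computation, however, is the same.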

Googlers have not released the code behind MapReduce, but they have published papers on it. That has enabled IBM, Yahoo and others to develop an open-source version called Hadoop, now widely used at other big Internet data centres. Today, Hadoop powers a pioneering service that has become the poster child of cloud computing, an approach many say could become the next big thing in computing.



