Yahoo Improves Apache Hadoop for Cloud
Yahoo has released an improved distribution of Apache Hadoop, an open source project that lets users process massive amounts of data. Yahoo Distribution of Hadoop supports many mission-critical apps at Yahoo, including Yahoo Search, Yahoo Mail and various content and advertising services.
Responding to requests from the Apache Hadoop community, Yahoo released its own improved distribution of Apache Hadoop, an open source project that lets users process massive amounts of data.
Yahoo said Hadoop now supports many Yahoo properties, including Yahoo Search, where it processes data for the billions of Web search queries run on Yahoo every month, as well as Yahoo Mail and various content and advertising services.
Apache Hadoop is a free Java software framework. The Yahoo distribution includes all the original Apache Hadoop source code and adds code patches to improve the stability and performance of its clusters. Yahoo said these patches have already been contributed back to Apache, although they may not yet be available in an Apache release of Hadoop.
Notably, Yahoo officials said the Yahoo Distribution of Hadoop has been tested and is deployed at Yahoo on the largest Hadoop clusters in the world.
"We know from our own experience serving half a billion users worldwide requires large-scale distributed systems, and a growing number of other companies and organizations are in need of similar capabilities," said Shelton Shugar, senior vice president of cloud computing at Yahoo, in a statement.
He added that Hadoop enables key Yahoo software infrastructure, running on tens of thousands of machines, to process data critical to Yahoo's core business. "By making the Yahoo Distribution of Hadoop generally available, we are contributing back to the Apache Hadoop community so that the ecosystem can benefit from Yahoo's quality and scale investments," Shugar said.
Yahoo said it has led the way in developing Apache Hadoop and is now the project's primary contributor. Hadoop founder Doug Cutting joined Yahoo in 2006, and the company said it began investing in Hadoop development at that time.