The Yahoo! Distribution of Hadoop is tested and deployed on Yahoo!’s clusters, which are the largest Hadoop clusters in the world. The Yahoo! Distribution of Hadoop is a source distribution that is based entirely on code found in the Apache Hadoop project.
Hadoop is a free Java software framework that supports data intensive distributed applications. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s MapReduce and Google File System (GFS) papers.
A wide variety of companies and organizations use Hadoop for both research and production. Users are encouraged to add themselves to the Hadoop users wiki page.
Amazon announced in April the beta release of a new service called Amazon Elastic MapReduce which they describe as “a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3).
I don’t think you can run an Hadoop equivalent on Windows Azure, although I hope Microsoft will provide some sort of Map Reduce implementation native to Windows Azure.
Here’s the GitHub repository with the source code for Hadoop.
Related posts:
- Amazon Announces Amazon Elastic Map Reduce Amazon announced today the public beta of Amazon Elastic MapReduce,...
- New Amazon AWS SDK for .NET Developers Released Under the pressure from Windows Azure release in a week,...
- The best Windows Azure RSS feed This is the RSS feed you need to subscribe to...












