Hadoop Distributions

This is a repost of my answer to a question on LinkedIn; I thought it might prove useful to people evaluating Hadoop distributions.

The following is a substantially oversimplified set of choices (in alphabetical order):

Amazon: Apache Hadoop provided as a web service. A good solution if your data is already collected on Amazon, since it saves you the trouble of uploading gigs and gigs of data.

Apache: Apache Hadoop is the core code base upon which the various distributions are built.

Cloudera: CDH3 is based on Hadoop 1 (the current stable version) and CDH4 is based on Hadoop 2; both are built on Apache Hadoop. The only piece that’s not open source (AFAIK) is Cloudera Manager, which allows you to install up to 50 nodes for free before you have to move to the paid version. Cloudera is an extremely popular solution that runs on a wide variety of operating systems.

Hortonworks: HDP1 is 100% open source and is based on Hadoop 1. HDP is designed to run on RedHat/CentOS/Oracle Linux.

IBM: IBM BigInsights adds the GPFS filesystem to Hadoop, and is a good choice if your company is already an IBM shop and you need to integrate with other IBM solutions. A free version is available as InfoSphere BigInsights Basic Edition. Basic Edition does not include all of the value-add features found in Enterprise Edition (such as GPFS-SNC).

MapR: MapR uses a proprietary file system plus additional changes to Hadoop that address issues with the platform. They have a shared-nothing architecture for the NameNode and JobTracker. MapR M3 is available for free, while M5 is a paid version with more features (such as the shared-nothing NameNode). People who have used MapR tend to like it.
