Hadoop

Hadoop is an open source software platform from the Apache Software Foundation. It addresses two constant requirements of Big Data management: inexpensive, reliable storage, and new tools for analyzing both structured and unstructured data. In short, the Hadoop software framework helps manage big data easily and inexpensively.

The Hadoop framework allows for the distributed processing of large data sets across clusters of computers using a simple programming model.
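The programming model referred to above is MapReduce: a map step emits key/value pairs, the framework groups the values by key, and a reduce step combines each group. The sketch below is a single-process Python simulation of that idea, not Hadoop's actual API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for illustration, and on a real cluster the map and reduce calls would run in parallel on different machines.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for each word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group emitted values by key, as the framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    # Each line could be mapped on a different machine; here it is a plain loop.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())

counts = word_count(["big data big clusters", "data on clusters"])
print(counts)
```

Because each call to map_phase depends only on its own line, and each call to reduce_phase only on its own key, the work can be divided among many machines without them sharing state.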

It is designed to run on a large number of machines that don’t share any memory or disks. It scales from a single server to thousands of machines, each offering local computation and storage.

When data is loaded into Hadoop, the software splits it into pieces and spreads them across the servers in the cluster. Hadoop keeps track of where each piece resides and stores multiple copies of it, so data held on a server that goes offline can be automatically re-replicated from a known good copy that Hadoop is already tracking.
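The splitting, tracking, and recovery described above can be pictured with a toy model. This is only a sketch under stated assumptions, not how Hadoop is implemented: the server names, the tiny block size, the round-robin placement, and the replication factor of 3 are all chosen for illustration, and the function names are hypothetical.

```python
REPLICATION = 3  # assumed number of copies kept for each block

def split_blocks(data, block_size):
    """Cut the input into fixed-size pieces (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

def place_blocks(num_blocks, servers, replication=REPLICATION):
    """Record which servers hold each block, using simple round-robin placement."""
    placement = {}
    for b in range(num_blocks):
        placement[b] = {servers[(b + r) % len(servers)] for r in range(replication)}
    return placement

def handle_failure(placement, failed, servers, replication=REPLICATION):
    """Forget the failed server and restore the copy count on live servers."""
    live = [s for s in servers if s != failed]
    for holders in placement.values():
        holders.discard(failed)
        for candidate in live:
            if len(holders) >= replication:
                break
            # In a real system this would copy the block from a surviving replica.
            holders.add(candidate)
    return placement

servers = ["s1", "s2", "s3", "s4", "s5"]  # hypothetical server names
blocks = split_blocks(b"0123456789abcdef", block_size=4)
placement = place_blocks(len(blocks), servers)
placement = handle_failure(placement, "s2", servers)
print(placement)
```

The placement dictionary plays the role of the bookkeeping that lets the system know where a good copy lives; after the simulated failure of s2, every block is back to three copies on the remaining servers.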

Hadoop is used by Yahoo, eBay, LinkedIn and Facebook.