Overview Of Apache Hadoop Job Support:
Apache Hadoop is an open-source software framework for the storage and large-scale processing of data sets on clusters of commodity hardware. Hadoop is an Apache top-level project built and used by a global community of contributors and users. It is licensed under the Apache License 2.0.
Apache Hadoop was born out of a need to process an avalanche of big data. The web was generating more and more information every day, and it was becoming very difficult to index over one billion pages of content. To cope, Google invented a new style of data processing known as MapReduce.
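To make the MapReduce idea concrete, below is a minimal sketch of the classic word-count job written against Hadoop's Java API: the map step emits a (word, 1) pair for every token in the input, and the reduce step sums those counts per word. Input and output paths are taken from the command line.

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {
  public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();
    public void map(Object key, Text value, Context context) throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one); // emit (word, 1) for each token
      }
    }
  }

  public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable result = new IntWritable();
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get(); // add up all the 1s emitted for this word
      }
      result.set(sum);
      context.write(key, result); // emit (word, total count)
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation on each mapper
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The framework handles the distribution: map tasks run in parallel across the cluster, and all values for the same key are routed to a single reduce call.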
A year after Google published a white paper describing the MapReduce framework, Doug Cutting and Mike Cafarella, inspired by that paper, created Hadoop to apply these concepts to an open-source software framework supporting distribution for the Nutch search engine project. Given that original use case, Hadoop was designed with a simple write-once storage infrastructure, as sketched below.
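The write-once design shows in the HDFS file API: a file is created, written sequentially, closed, and then read many times; there is no API for updating bytes in place. A minimal sketch, assuming a configured HDFS and a hypothetical path:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WriteOnce {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf); // the default file system (HDFS when so configured)
    Path path = new Path("/user/demo/events.log"); // hypothetical path for illustration
    try (FSDataOutputStream out = fs.create(path)) { // create() starts a brand-new file
      out.writeBytes("one record\n"); // data is written sequentially, then the file is closed
    }
    // After close, the classic model treats the file as read-only:
    // readers call fs.open(path); bytes are never rewritten in place.
  }
}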
The Apache Hadoop software library can detect and handle failures at the application layer, so it can deliver a highly available service on top of a cluster of computers, each of which may be prone to failure.
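Much of that resilience comes from block replication: HDFS keeps several copies of each block on different machines, so losing one node does not lose the data. A hedged sketch of controlling the replication factor through the real dfs.replication property (the path is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.setInt("dfs.replication", 3); // 3 replicas per block, the usual HDFS default
    FileSystem fs = FileSystem.get(conf);
    // Files created through this FileSystem get their blocks copied across the cluster,
    // and the replication factor can also be adjusted per file after the fact:
    fs.setReplication(new Path("/user/demo/events.log"), (short) 3); // hypothetical path
  }
}

When a machine fails, the NameNode notices the missing replicas and schedules new copies elsewhere, which is the application-layer failure handling the paragraph above describes.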