Friday, June 3, 2016

Differences between Hadoop HDFS and Google GFS

In Hadoop, the reducer is presented with a key and an iterator over all of the values associated with that key. The values arrive in arbitrary order.
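To make that concrete, here is a minimal reducer sketch against the org.apache.hadoop.mapreduce API; the class name SumReducer and the word-count use case are placeholders chosen for illustration, not taken from any particular project.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Illustrative reducer: sums the counts emitted for each word.
// The framework calls reduce() once per key with an Iterable over all
// values for that key; the iteration order of the values is not guaranteed.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable total = new IntWritable();

    @Override
    protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : counts) {   // values arrive in arbitrary order
            sum += count.get();
        }
        total.set(sum);
        // A reducer may emit zero, one, or many key-value pairs here,
        // and the output key is not required to match the input key.
        context.write(word, total);
    }
}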

Google's implementation allows the programmer to specify a secondary sort key for ordering the values (if desired), in which case the values associated with each key are presented to the developer's reduce code in sorted order.
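Hadoop has no such built-in switch, but the same effect is usually obtained with the "secondary sort" pattern: a composite map output key plus custom sort and grouping comparators. The sketch below shows only the job wiring; CompositeKey, CompositeKeySortComparator, NaturalKeyGroupingComparator, and NaturalKeyPartitioner are hypothetical classes the developer would still have to write.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class SecondarySortJob {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "secondary sort sketch");
        job.setJarByClass(SecondarySortJob.class);

        // Hypothetical WritableComparable that carries the natural key plus
        // the field the values should be ordered by.
        job.setMapOutputKeyClass(CompositeKey.class);
        job.setMapOutputValueClass(Text.class);

        // Sorts map output by (natural key, secondary field), so values reach
        // the reducer already ordered. (Hypothetical comparator class.)
        job.setSortComparatorClass(CompositeKeySortComparator.class);

        // Groups records by the natural key only, so every record sharing it
        // lands in the same reduce() call. (Hypothetical comparator class.)
        job.setGroupingComparatorClass(NaturalKeyGroupingComparator.class);

        // The partitioner must also hash on the natural key alone.
        job.setPartitionerClass(NaturalKeyPartitioner.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}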

Some of the key differences are listed below:

HDFS
  • Developed largely at Yahoo! (growing out of the Apache Nutch project) and contributed to the Apache Software Foundation as open source
  • Written in Java
  • Consists of a NameNode and DataNodes
  • Default block size is 128 MB (since Hadoop 2.x; see the sketch after this list)
  • Follows the WORM model: write once, read many times
  • Deleted files are moved to a trash directory and purged automatically after a configurable interval
  • The reducer can emit an arbitrary number of key-value pairs, and the output key need not match the input key
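As a rough illustration of the block-size and write-once points above, this sketch uses the standard org.apache.hadoop.fs API to ask the cluster for its default block size and to write a file once; the hdfs://namenode:8020 URI and the file path are placeholders for a real cluster.

import java.net.URI;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsBlockSizeDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder NameNode address; point this at your own cluster.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"),
                                       new Configuration());
        Path file = new Path("/tmp/worm-demo.txt");

        // Default block size the cluster would use for this path
        // (128 MB on Hadoop 2.x+ unless dfs.blocksize is overridden).
        System.out.println("Default block size: "
                + fs.getDefaultBlockSize(file) + " bytes");

        // Write once: the file is written sequentially and then closed.
        // HDFS has no in-place updates; at most an append is possible.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }
        fs.close();
    }
}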


GFS
  • Developed by Google Inc.
  • Implemented in C++ (most likely)
  • Consists of a Master node and Chunkservers
  • Default chunk (block) size is 64 MB
  • Files can be written multiple times; both random writes and record appends are supported
  • Deleted files are renamed to a hidden name and are garbage-collected if they remain unused for more than three days (by default)
  • The programmer is not allowed to change the key in the reducer, i.e. the reducer's output key must be the same as its input key

