Friday, June 3, 2016

Differences between Hadoop HDFS and Google GFS

In Hadoop, the reducer is presented with a key and an iterator over all of the values associated with that key. The values arrive in arbitrary order.
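To make that concrete, here is a minimal reducer sketch against the org.apache.hadoop.mapreduce API; the class name SumReducer and the word-count use case are placeholders chosen for illustration, not taken from any particular project.

import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

// Illustrative reducer: sums the counts emitted for each word.
// The framework calls reduce() once per key with an Iterable over all
// values for that key; the iteration order of the values is not guaranteed.
public class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    private final IntWritable total = new IntWritable();

    @Override
    protected void reduce(Text word, Iterable<IntWritable> counts, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable count : counts) {   // values arrive in arbitrary order
            sum += count.get();
        }
        total.set(sum);
        // A reducer may emit zero, one, or many key-value pairs here,
        // and the output key is not required to match the input key.
        context.write(word, total);
    }
}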

Google's implementation allows the programmer to specify a secondary sort key for ordering the values (if desired), in which case the values associated with each key are presented to the developer's reduce code in sorted order.
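Hadoop has no such built-in switch, but the same effect is usually obtained with the "secondary sort" pattern: a composite map output key plus custom sort and grouping comparators. The sketch below shows only the job wiring; CompositeKey, CompositeKeySortComparator, NaturalKeyGroupingComparator, and NaturalKeyPartitioner are hypothetical classes the developer would still have to write.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;

public class SecondarySortJob {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "secondary sort sketch");
        job.setJarByClass(SecondarySortJob.class);

        // Hypothetical WritableComparable that carries the natural key plus
        // the field the values should be ordered by.
        job.setMapOutputKeyClass(CompositeKey.class);
        job.setMapOutputValueClass(Text.class);

        // Sorts map output by (natural key, secondary field), so values reach
        // the reducer already ordered. (Hypothetical comparator class.)
        job.setSortComparatorClass(CompositeKeySortComparator.class);

        // Groups records by the natural key only, so every record sharing it
        // lands in the same reduce() call. (Hypothetical comparator class.)
        job.setGroupingComparatorClass(NaturalKeyGroupingComparator.class);

        // The partitioner must also hash on the natural key alone.
        job.setPartitionerClass(NaturalKeyPartitioner.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}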

Some of the key differences are listed below:

HDFS
  • Developed largely at Yahoo! (growing out of the Apache Nutch project) and contributed to the Apache Software Foundation as open source
  • Written in Java
  • Consists of a NameNode and DataNodes
  • Default block size is 128 MB (since Hadoop 2.x; see the sketch after this list)
  • Follows the WORM model: write once, read many times
  • Deleted files are moved to a trash directory and purged automatically after a configurable interval
  • The reducer can emit an arbitrary number of key-value pairs, and the output key need not match the input key
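As a rough illustration of the block-size and write-once points above, this sketch uses the standard org.apache.hadoop.fs API to ask the cluster for its default block size and to write a file once; the hdfs://namenode:8020 URI and the file path are placeholders for a real cluster.

import java.net.URI;
import java.nio.charset.StandardCharsets;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsBlockSizeDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder NameNode address; point this at your own cluster.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"),
                                       new Configuration());
        Path file = new Path("/tmp/worm-demo.txt");

        // Default block size the cluster would use for this path
        // (128 MB on Hadoop 2.x+ unless dfs.blocksize is overridden).
        System.out.println("Default block size: "
                + fs.getDefaultBlockSize(file) + " bytes");

        // Write once: the file is written sequentially and then closed.
        // HDFS has no in-place updates; at most an append is possible.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }
        fs.close();
    }
}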


GFS
  • Developed by Google Inc.
  • Implemented in C++ (most likely)
  • Consists of a Master node and Chunkservers
  • Default chunk (block) size is 64 MB
  • Files can be written multiple times; both random writes and record appends are supported
  • Deleted files are renamed to a hidden name and are garbage-collected if they remain unused for more than three days (by default)
  • The programmer is not allowed to change the key in the reducer, i.e. the reducer's output key must be the same as its input key

