Hadoop: Apache Hadoop 是一种软件编程框架,其中存储了大量数据并用于执行计算。它的框架基于Java编程,类似于 C 和 shell 脚本。换言之,我们可以说它是一个用于为集群系统下运行的各种大数据应用管理数据、存储数据和处理数据的平台。 Hadoop 的主要组件是 HDFS、Map Reduce 和 YARN。
MongoDB: MongoDB 是一个面向文档的跨平台数据库程序。它是一个 NoSQL 数据库程序,使用 JSON 文档(更具体地说是 Binary-JSON)和模式。 MongoDB Inc. 开发了 MongoDB,并已根据服务器端公共许可证(也称为 SSPL)获得许可。
下表列出了 Hadoop 和 MongoDB 之间的差异:
Based on | Hadoop | MongoDB |
---|---|---|
Fortmat of Data | It can be used with boyh structured or unstructured data | Uses only CSV or JSON format |
Design purpose | It is primarily designed as a database. | It is designed to analyze and process large volume of data. |
Built | It is a Java based application | It is a C++ based application |
Strength | Handling of batch processes and lengthy-running ETL jobs is excellently done in Hadoop. It comes very handy while for managing Big Data | It is more robust and flexible as compared to Hadoop |
Cost of Hardware | As it is a group of various software, it can cost more | Being a single product makes it cost effective |
Framework | It comprises of various software that is responsible for creating a data processing framework. | It can be used to query, aggregate, index or replicate data stored. The stored data is in the form of Binary JSON(BJSON) and storing of data is done in collections. |
RDBMS | It is not designed to replace a RDBMS system but provides addition support to RDBMS to archive data and also gives it a wide variety of use cases. | It is designed for the purpose of replacing or enhancing the RDBMS and giving it a wide variety of use cases. |
Drawbacks | Highly depends upon ‘NameNode’, that can be a point of failure | Low fault-tolerance that leads to loss of data occasionally |