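Hadoop is an open-source software framework that runs on clusters of machines. It is used for distributed storage and distributed processing of very large data sets (big data), and the processing is done with the MapReduce programming model. Implemented in Java, it is a developer-friendly tool for building big data applications. It easily handles massive volumes of data on clusters of commodity servers and can mine data in any form: structured, unstructured, or semi-structured. It is highly scalable. It consists of three components: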
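- HDFS: the reliable distributed storage layer, often claimed to hold around half of the world's data.
- MapReduce: the distributed processing layer.
- YARN: the resource management layer.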
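To make the MapReduce programming model more concrete, here is a minimal single-process sketch of a word count in Python. It only imitates the map, shuffle, and reduce phases in memory; it is not Hadoop's API, and the function names (map_phase, shuffle, reduce_phase, word_count) are hypothetical names chosen for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data on Hadoop", "big clusters store big data"]))
```

In a real cluster the map and reduce steps would run on different nodes over blocks stored in HDFS, with YARN scheduling the work; the sketch only shows the data flow between the phases.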
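Amazon Redshift is a cloud-based, large-scale data warehouse service. It is commercially licensed and is part of Amazon Web Services. It handles large volumes of data and is known for its scalability. It processes data in parallel. It follows ACID properties and is widely used. It is implemented in C and offers high availability. Key features of Amazon Redshift: a fast, simple, and cost-effective data warehouse service.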
The following table lists the differences between Apache Hadoop and Amazon Redshift:
APACHE HADOOP | AMAZON REDSHIFT |
Hadoop is about 10 times costlier than Redshift, at roughly $200 per month. | It is cheaper than Hadoop, at roughly $20 per month; the exact price depends on the region of the server. |
MapReduce jobs are slower in Hadoop. | Redshift performs much faster than a Hadoop cluster. For example, a 16-node Redshift cluster performed a lot faster than a 44-node Hive/Elastic MapReduce cluster. |
Hadoop has a storage layer and stores data as files without taking any underlying data structure into account. | Redshift is a columnar database designed for complex queries spanning millions of rows. Data is arranged in tables, and the supported structures are based on the PostgreSQL standard. |
Use the HDFS put shell command to copy data to the Hadoop cluster, and the get command to copy it back out. | Data is first staged in Amazon S3 and then loaded into Redshift with the COPY command. |
Scaling is not a limiting factor in Hadoop, as one can scale to any amount of storage by managing and integrating nodes properly. | Redshift can only scale up to 2 PB. |
Query performance is slower than Redshift's. | Roughly ten times faster than Hadoop. |
Hadoop is an open-source framework from the Apache Software Foundation. | Redshift is a paid service provided by Amazon. |
Hadoop is more flexible and works with the local file system and any database. | Redshift can only load data from Amazon S3 or DynamoDB. |
Administrative activities are complex and tricky to handle in Hadoop. | Redshift automates backups to Amazon S3 and data warehouse administration. |
It is distributed by vendors such as Hortonworks and Cloudera. | It is developed and provided by Amazon Web Services. |
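The data-loading row of the table can be illustrated with a short Python sketch that builds the two commands involved: an hdfs dfs -put shell command for Hadoop and a COPY statement for Redshift. The helper names, the bucket, the table, and the IAM role ARN are all hypothetical placeholders, and the sketch only constructs strings; it does not connect to a cluster or escape untrusted input.

```python
import shlex


def hdfs_put_command(local_path, hdfs_path):
    """Return the shell command that copies a local file into HDFS."""
    return shlex.join(["hdfs", "dfs", "-put", local_path, hdfs_path])


def redshift_copy_statement(table, s3_uri, iam_role):
    """Return a COPY statement that loads CSV files staged in S3 into a table.

    Values are interpolated directly, so this is for illustration only and
    assumes trusted, hard-coded inputs.
    """
    return (
        f"COPY {table} "
        f"FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role}' "
        "FORMAT AS CSV;"
    )


if __name__ == "__main__":
    # Hypothetical paths and identifiers, chosen only for the example.
    print(hdfs_put_command("sales.csv", "/data/sales.csv"))
    print(
        redshift_copy_statement(
            "sales",
            "s3://example-bucket/sales/",
            "arn:aws:iam::123456789012:role/ExampleRole",
        )
    )
```

The Hadoop path is a single file copy into the cluster's file system, while the Redshift path is two steps: stage the files in S3, then run COPY from a SQL client so the cluster can load them in parallel.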