📜  Hadoop 和 SQL 的区别

📅  最后修改于: 2021-10-27 06:33:57             🧑  作者: Mango

Hadoop:它是一个将大数据存储在分布式系统中然后并行处理的框架。 Hadoop 的四个主要组件是 Hadoop 分布式文件系统 (HDFS)、Yarn、MapReduce 和库。它不仅涉及大数据,还涉及结构化、半结构化和非结构化信息的混合。亚马逊、IBM、微软、Cloudera、ScienceSoft、Pivotal、Hortonworks 是一些使用 Hadoop 技术的公司。

SQL:结构化查询语言是一种领域特定语言,用于计算和处理关系数据库管理系统中的数据管理,它还处理关系数据流管理系统中的数据流。简而言之,SQL 是一种标准的数据库语言,用于从 MySQL、Oracle、SQL Server 等关系数据库中创建、存储和提取数据。

下表列出了 Hadoop 和 SQL 之间的差异:

Feature Hadoop SQL
Technology Modern Traditional
Volume Usually in PetaBytes Usually in GigaBytes
Operations Storage, processing, retrieval and pattern extraction from data Storage, processing, retrieval and pattern mining of data
Fault Tolerance Hadoop is highly fault tolerant SQL has good fault tolerance
Storage Stores data in the form of key-value pairs, tables, hash map etc in distributed systems. Stores structured data in tabular format with fixed schema in cloud
Scaling Linear Non linear
Providers Cloudera, Horton work, AWS etc. provides Hadoop systems. Well-known industry leaders in SQL systems are Microsoft, SAP, Oracle etc.
Data Access Batch oriented data access Interactive and batch oriented data access
Cost It is open source and systems can be cost effectively scaled It is licensed and costs a fortune to buy a SQL server, moreover if system runs out of storage additional charges also emerge
Time Statements are executed very quickly SQL syntax is slow when executed in millions of rows
Optimization It stores data in HDFS and process though Map Reduce with huge optimization techniques. It does not have any advanced optimization techniques
Structure Dynamic schema, capable of storing and processing log data, real-time data, images, videos, sensor data etc.(both structured and unstructured) Static Schema, capable of storing data(fixed schema) in tabular format only(structured)
Data Update Write data once, read data multiple times Read and Write data multiple times
Integrity Low High
Interaction Hadoop uses JDBC(Java Database Connectivity) to communicate with SQL systems to send and receive data SQL systems can read and write data to Hadoop systems
Hardware Uses commodity hardware Uses propriety hardware
Training Learning Hadoop for entry-level as well as seasoned profession is moderately hard Learning SQL is easy for even entry-level professionals