1. 猪:
Pig用于大量数据的分析。它是对 MapReduce 的抽象。 Pig 用于在 Hadoop 中执行各种数据操作操作。它提供了 Pig-Latin 语言来编写包含许多内置功能的代码,例如 join、filter 等。 Apache Pig 的两个部分是 Pig-Latin 和 Pig-Engine。 Pig Engine 用于将所有这些脚本转换为特定的地图并减少任务。 Pig 抽象处于更高级别。与 MapReduce 相比,它包含的代码行更少。
2.Hive:
Hive构建在 Hadoop 之上,用于处理 Hadoop 中的结构化数据。 Hive是由 Facebook 开发的。它提供了各种类型的查询语言,通常称为Hive查询语言。 Apache Hive是一个数据仓库,它在用户和集成了 Hadoop 的 Hadoop 分布式文件系统 (HDFS) 之间提供类似 SQL 的接口。
Pig 和Hive 的区别:
S.No. | Pig | Hive |
---|---|---|
1. | Pig operates on the client side of a cluster. | Hive operates on the server side of a cluster. |
2. | Pig uses pig-latin language. | Hive uses HiveQL language. |
3. | Pig is a Procedural Data Flow Language. | Hive is a Declarative SQLish Language. |
4. | It was developed by Yahoo. | It was developed by Facebook. |
5. | It is used by Researchers and Programmers. | It is mainly used by Data Analysts. |
6. | It is used to handle structured and semi-structured data. | It is mainly used to handle structured data. |
7. | It is used for programming. | It is used for creating reports. |
8. | Pig scripts end with .pig extension. | In HIve, all extensions are supported. |
9. | It does not support partitioning. | It supports partitioning. |
10. | It loads data quickly. | It loads data slowly. |
11. | It does not support JDBC. | It supports JDBC. |
12. | It does not support ODBC. | It supports ODBC. |
13. | Pig does not have a dedicated metadata database. | Hive makes use of the exact variation of dedicated SQL-DDL language by defining tables beforehand. |
14. | It supports Avro file format. | It does not support Avro file format. |
15. | Pig is suitable for complex and nested data structures. | Hive is suitable for batch-processing OLAP systems. |
16. | Pig does not support schema to store data. | Hive supports schema for data insertion in tables. |