Difference Between Pig and Hive - 芒果文档

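1. Pig:
Pig is used to analyze large volumes of data. It is an abstraction over MapReduce and is used to perform a wide range of data manipulation operations in Hadoop. It provides the Pig Latin language for writing code, with many built-in operations such as join and filter. Apache Pig has two parts: Pig Latin and the Pig Engine. The Pig Engine converts Pig Latin scripts into specific map and reduce tasks. Because Pig works at a higher level of abstraction than MapReduce, an equivalent job takes fewer lines of code.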
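The step-by-step style of a Pig Latin script (load, then filter, then join) can be loosely illustrated in plain Python. This is only a sketch of the data-flow idea, not Pig itself: the `users` and `orders` records and the helper names `filter_rows` and `join_rows` are hypothetical, and the data lives in memory rather than on HDFS.

```python
# Hypothetical in-memory records standing in for files stored on HDFS.
users = [
    {"user_id": 1, "name": "ana", "age": 34},
    {"user_id": 2, "name": "bo", "age": 19},
    {"user_id": 3, "name": "cy", "age": 41},
]
orders = [
    {"user_id": 1, "amount": 20.0},
    {"user_id": 3, "amount": 5.5},
    {"user_id": 3, "amount": 12.0},
]


def filter_rows(rows, predicate):
    """Keep only the rows for which predicate returns True (a filter step)."""
    return [row for row in rows if predicate(row)]


def join_rows(left, right, key):
    """Inner-join two lists of dicts on a shared key (a join step)."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Each statement names an intermediate result, as a data-flow script would.
adults = filter_rows(users, lambda row: row["age"] > 30)
joined = join_rows(adults, orders, "user_id")

for row in joined:
    print(row["name"], row["amount"])
```

Each intermediate result (`adults`, `joined`) is named and built explicitly, which is the sense in which a data-flow language is called procedural.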

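2. Hive:
Hive is built on top of Hadoop and is used to process structured data in Hadoop. Hive was developed by Facebook. It provides a SQL-like query language known as Hive Query Language (HiveQL). Apache Hive is a data warehouse that provides a SQL-like interface between the user and the Hadoop Distributed File System (HDFS), which integrates with Hadoop.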
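To give a feel for what a SQL-like interface over structured data looks like, here is a small sketch that uses Python's built-in `sqlite3` module as a stand-in. SQLite is not Hive, and the `page_views` table and its columns are assumptions made for illustration; the point is only that the table is defined up front and the query states what result is wanted rather than how to compute it.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for a SQL-like engine.
conn = sqlite3.connect(":memory:")

# The table and its schema are declared before any data is inserted.
conn.execute(
    "CREATE TABLE page_views (user_id INTEGER, country TEXT, views INTEGER)"
)
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [(1, "US", 10), (2, "IN", 4), (3, "US", 7)],
)

# A declarative aggregate query: total views per country.
rows = conn.execute(
    "SELECT country, SUM(views) AS total_views "
    "FROM page_views GROUP BY country ORDER BY country"
).fetchall()
conn.close()

print(rows)
```

In a real Hive deployment a similar query would be compiled into jobs that run on the cluster, whereas SQLite evaluates it locally.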

Difference between Pig and Hive:

S.No.	Pig	Hive
1.	Pig operates on the client side of a cluster.	Hive operates on the server side of a cluster.
2.	Pig uses the Pig Latin language.	Hive uses the HiveQL language.
3.	Pig Latin is a procedural data-flow language.	HiveQL is a declarative, SQL-like language.
4.	It was developed by Yahoo.	It was developed by Facebook.
5.	It is mainly used by researchers and programmers.	It is mainly used by data analysts.
6.	It is used to handle structured and semi-structured data.	It is mainly used to handle structured data.
7.	It is used for programming.	It is used for creating reports.
8.	Pig scripts conventionally end with the .pig extension.	In Hive, scripts may use any file extension.
9.	It does not support partitioning.	It supports partitioning.
10.	It loads data quickly.	It loads data slowly.
11.	It does not support JDBC.	It supports JDBC.
12.	It does not support ODBC.	It supports ODBC.
13.	Pig does not have a dedicated metadata database.	Hive defines tables beforehand using SQL-style DDL and keeps their metadata in a dedicated metastore database.
14.	It supports the Avro file format.	It also supports the Avro file format, through the AvroSerDe.
15.	Pig is suitable for complex and nested data structures.	Hive is suitable for batch-processing OLAP systems.
16.	Pig does not require a schema for stored data; schemas are optional.	Hive requires a schema for the tables into which data is inserted.
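The partitioning row above can be made more concrete. Hive conventionally stores each partition of a table in its own directory named `column=value`, so a query that filters on the partition column only needs to read the matching directories. The sketch below mimics that layout in plain Python; the `/warehouse/page_views` path, the sample rows, and the helper names are hypothetical, and nothing is written to disk.

```python
from collections import defaultdict


def partition_path(table_dir, partition_column, value):
    """Build a Hive-style partition directory path such as .../country=US."""
    return f"{table_dir}/{partition_column}={value}"


def partition_rows(rows, table_dir, partition_column):
    """Group rows by partition directory, dropping the partition column
    from each row because its value is encoded in the directory name."""
    partitions = defaultdict(list)
    for row in rows:
        path = partition_path(table_dir, partition_column, row[partition_column])
        data = {k: v for k, v in row.items() if k != partition_column}
        partitions[path].append(data)
    return dict(partitions)


# Hypothetical sample rows and table location.
rows = [
    {"user_id": 1, "country": "US", "views": 10},
    {"user_id": 2, "country": "IN", "views": 4},
    {"user_id": 3, "country": "US", "views": 7},
]
layout = partition_rows(rows, "/warehouse/page_views", "country")

for path in sorted(layout):
    print(path, layout[path])
```

A filter such as `country = 'US'` would then touch only the `country=US` directory, which is the main benefit partitioning is meant to provide.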