📜  Pig Latin概念

📅  最后修改于: 2020-12-03 08:58:07             🧑  作者: Mango

Pig Latin

Pig Latin是Apache Pig用于分析Hadoop中数据的数据流语言。它是一种文本语言,将Java MapReduce惯用语中的编程抽象为一种表示法。

Pig Latin语句

Pig Latin语句用于处理数据。它是一个接受一个关系作为输入并生成另一个关系作为输出的运算符。

  • 它可以跨越多行。
  • 每个语句必须以分号结尾。
  • 它可能包括表达式和模式。
  • 默认情况下,使用多查询执行来处理这些语句

Pig拉丁约定

Convention Description
( ) The parenthesis can enclose one or more items. It can also be used to indicate the tuple data type.
Example – (10, xyz, (3,6,9))
[ ] The straight brackets can enclose one or more items. It can also be used to indicate the map data type.
Example – [INNER | OUTER]
{ } The curly brackets enclose two or more items. It can also be used to indicate the bag data type
Example – { block | nested_block }
The horizontal ellipsis points indicate that you can repeat a portion of the code.
Example – cat path [path …]

拉丁数据类型

简单数据类型

Type Description
int It defines the signed 32-bit integer.
Example – 2
long It defines the signed 64-bit integer.
Example – 2L or 2l
float It defines 32-bit floating point number.
Example – 2.5F or 2.5f or 2.5e2f or 2.5.E2F
double It defines 64-bit floating point number.
Example – 2.5 or 2.5 or 2.5e2f or 2.5.E2F
chararray It defines character array in Unicode UTF-8 format.
Example – javatpoint
bytearray It defines the byte array.
boolean It defines the boolean type values.
Example – true/false
datetime It defines the values in datetime order.
Example – 1970-01- 01T00:00:00.000+00:00
biginteger It defines Java BigInteger values.
Example – 5000000000000
bigdecimal It defines Java BigDecimal values.
Example – 52.232344535345

复杂类型

Type Description
tuple It defines an ordered set of fields.
Example – (15,12)
bag It defines a collection of tuples.
Example – {(15,12), (12,15)}
map It defines a set of key-value pairs.
Example – [open#apache]