PySpark - Extract a Single Value from a DataFrame

Last modified: 2022-05-13 01:54:32.738000 | Author: Mango

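In this article, we extract a single value from a column of a PySpark DataFrame. To do this, we use the first() and head() functions, each of which returns the first row of the DataFrame as a Row object.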

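A single value here means exactly one value: one field of that first row, which we can look up by column name or by position.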

First, we create the DataFrame from a nested list.



Python3
# import SparkSession from the pyspark.sql module
from pyspark.sql import SparkSession

# create a SparkSession and give the app a name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()

# list of student records (one record appears twice)
data = [["1", "sravan", "vignan"],
        ["2", "ojaswi", "vvit"],
        ["3", "rohith", "vvit"],
        ["4", "sridevi", "vignan"],
        ["1", "sravan", "vignan"],
        ["5", "gnanesh", "iit"]]

# specify column names
columns = ['student ID', 'student NAME', 'college']

# create a DataFrame from the list of records
dataframe = spark.createDataFrame(data, columns)

print("Actual data in dataframe")
# show the DataFrame
dataframe.show()



Output:

Actual data in dataframe
+----------+------------+-------+
|student ID|student NAME|college|
+----------+------------+-------+
|         1|      sravan| vignan|
|         2|      ojaswi|   vvit|
|         3|      rohith|   vvit|
|         4|     sridevi| vignan|
|         1|      sravan| vignan|
|         5|     gnanesh|    iit|
+----------+------------+-------+

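Example 1: Python program to extract a single value from a particular column using first().

Python3

# first() returns the first row of the DataFrame;
# index it by column name to get a single value
dataframe.first()['student ID']

Output: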

'1'

Example 2: Extract a single value using head() and a positional index.

Python3

# head() with no argument returns the first row;
# index 0 selects its first column ('student ID')
dataframe.head()[0]

Output:

'1'

Example 3: Extract a single value from the third column using head().

Python3

# head() with no argument returns the first row;
# index 2 selects its third column ('college')
dataframe.head()[2]

Output:

'vignan'