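PySpark – Extract a Single Value from a DataFrame
In this article, we will extract a single value from a column of a PySpark DataFrame. To do this, we will use the first() and head() functions.
A single value means exactly one value: the entry of one column in the first row. We can extract it by column name or by column position.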
Syntax:
- dataframe.first()['column name']
- dataframe.head()[index]
Where,
- dataframe is the input DataFrame and column name is the name of the column to read
- index is the zero-based position of the column within the first row.
First, we create the DataFrame from a nested list.
Python3
# importing module
import pyspark
# importing sparksession from pyspark.sql module
from pyspark.sql import SparkSession
# creating sparksession and giving an app name
spark = SparkSession.builder.appName('sparkdf').getOrCreate()
# list of students data
data = [["1", "sravan", "vignan"],
        ["2", "ojaswi", "vvit"],
        ["3", "rohith", "vvit"],
        ["4", "sridevi", "vignan"],
        ["1", "sravan", "vignan"],
        ["5", "gnanesh", "iit"]]
# specify column names
columns = ['student ID', 'student NAME', 'college']
# creating a dataframe from the lists of data
dataframe = spark.createDataFrame(data, columns)
print("Actual data in dataframe")
# show dataframe
dataframe.show()
Output:
Actual data in dataframe
+----------+------------+-------+
|student ID|student NAME|college|
+----------+------------+-------+
|         1|      sravan| vignan|
|         2|      ojaswi|   vvit|
|         3|      rohith|   vvit|
|         4|     sridevi| vignan|
|         1|      sravan| vignan|
|         5|     gnanesh|    iit|
+----------+------------+-------+
Example 1: Python program to extract a single value from a specific column using first().
Python3
# extract single value based on
# column in the dataframe
dataframe.first()['student ID']
Output:
'1'
Example 2: Extract a single value from the first column using head().
Python3
# extract single value based
# on column in the dataframe
dataframe.head()[0]
Output:
'1'
Example 3: Extract a single value from the third column using head().
Python3
# extract single value based
# on column in the dataframe
dataframe.head()[2]
Output:
'vignan'