📅  Last modified: 2023-12-03 15:20:11.606000             🧑  Author: Mango
For programmers working with big data, Spark DataFrames are a popular choice because they handle structured and semi-structured data well. A basic step when working with data is understanding its shape. For this, the pandas API on Spark (pyspark.pandas) provides the df.shape attribute, which returns the shape of a DataFrame in Python. A plain pyspark.sql.DataFrame has no shape attribute; there, the same information comes from df.count() and len(df.columns).
Here's the basic syntax for using df.shape:
df.shape
df.shape is a property rather than a callable method, so it is accessed without parentheses and takes no parameters.
The df.shape attribute returns a tuple of two integers: the number of rows followed by the number of columns in the DataFrame.
Suppose we have a DataFrame df with the following data:

| name   | age | gender |
|--------|-----|--------|
| Alice  | 25  | F      |
| Bob    | 30  | M      |
| Claire | 35  | F      |
| David  | 40  | M      |
We can use df.shape to obtain the shape of the DataFrame:
shape = df.shape
print(shape)
Output:
(4, 3)
This tells us that the DataFrame df has 4 rows and 3 columns.
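A plain pyspark.sql.DataFrame does not expose a shape attribute, but the same pair can be assembled from count() and columns. The sketch below is one possible way to do that; spark_shape and FakeFrame are hypothetical names introduced here for illustration, and FakeFrame is only a stand-in so the example runs without a Spark session.

```python
from typing import Any, Tuple


def spark_shape(sdf: Any) -> Tuple[int, int]:
    """Return (rows, columns) for a pyspark.sql.DataFrame-like object.

    Note: on a real Spark DataFrame, count() triggers a job over the
    data, so this can be slow for large datasets.
    """
    return sdf.count(), len(sdf.columns)


class FakeFrame:
    """Hypothetical stand-in exposing only count() and columns."""

    def __init__(self, rows, columns):
        self._rows = rows
        self.columns = columns

    def count(self):
        return len(self._rows)


demo = FakeFrame(
    rows=[("Alice", 25, "F"), ("Bob", 30, "M"),
          ("Claire", 35, "F"), ("David", 40, "M")],
    columns=["name", "age", "gender"],
)
print(spark_shape(demo))  # (4, 3)
```

With a real Spark DataFrame, the same helper would be called as spark_shape(sdf). Because counting rows scans the data, it may be worth caching the result rather than calling it repeatedly.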
In this article, we discussed the df.shape attribute, available in Python on DataFrames from the pandas API on Spark, which is a convenient way to obtain the shape of a DataFrame. It returns a tuple of two integers: the number of rows and the number of columns. Checking the shape is a useful first step before performing further operations on the DataFrame.