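Loop or Iterate over All or Certain Columns of a DataFrame in Python-Pandas
In this article, we discuss how to loop or iterate over all or certain columns of a DataFrame. There are several ways to accomplish this task.
Let's first create a DataFrame and look at it:
Code: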
Python3
# import pandas package
import pandas as pd

# List of tuples: (name, age, section)
students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B'),
            ]

# Create a DataFrame object
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Display the DataFrame (in a plain script, use print(stu_df))
stu_df
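Output:
Now let's look at different ways of iterating over all or certain columns of a DataFrame:
Method #1: Using DataFrame.items() (formerly DataFrame.iteritems()):
The DataFrame class provides a member function iteritems(), which returns an iterator over all the columns of the DataFrame. For each column, it yields a tuple containing the column name and the column's contents as a Series. Note that iteritems() was removed in pandas 2.0; DataFrame.items() behaves the same way and works in both older and newer versions.
Code: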
Python3
import pandas as pd

# List of tuples: (name, age, section)
students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B'),
            ]

# Create a DataFrame object
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# items() yields a (column name, Series) tuple for each column.
# It replaces iteritems(), which was removed in pandas 2.0.
for (columnName, columnData) in stu_df.items():
    print('Column Name : ', columnName)
    print('Column Contents : ', columnData.values)
Output:
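As a further illustration of Method #1: because each column arrives as a Series, any Series method can be applied inside the loop. The following is a minimal sketch under that assumption; the helper name unique_counts is hypothetical and not part of pandas.

```python
import pandas as pd

students = [('Ankit', 22, 'A'), ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'), ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])


def unique_counts(df):
    """Hypothetical helper: map each column name to its number of distinct values."""
    counts = {}
    for column_name, column_data in df.items():
        # column_data is a pandas Series, so Series methods such as nunique() apply
        counts[column_name] = column_data.nunique()
    return counts


summary = unique_counts(stu_df)
print(summary)
```

For the sample data, this maps Name to 4, Age to 1, and Section to 2.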
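Method #2: Using the [ ] operator:
Iterating over a DataFrame directly yields its column names, so we can loop over the names and select each column we want with the [ ] operator.
Code: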
Python3
import pandas as pd

# List of tuples: (name, age, section)
students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B'),
            ]

# Create a DataFrame object
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Iterate over column names
for column in stu_df:
    # Select column contents by column
    # name using [] operator
    columnSeriesObj = stu_df[column]
    print('Column Name : ', column)
    print('Column Contents : ', columnSeriesObj.values)
Output:
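To make explicit what the loop in Method #2 iterates over, the short sketch below compares the names produced by iterating the DataFrame with those in its columns attribute. The variable names are illustrative only.

```python
import pandas as pd

students = [('Ankit', 22, 'A'), ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'), ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Iterating over a DataFrame yields its column labels, not its rows
names_from_iteration = [column for column in stu_df]

# The same labels are available through the columns attribute
names_from_attribute = list(stu_df.columns)

print(names_from_iteration == names_from_attribute)
```

Both lists contain the column labels in their original order, so either form can drive the loop.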
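Method #3: Iterating over more than one column:
Suppose we need to iterate over only some of the columns. To do this, we can select several columns from the DataFrame and iterate over that selection.
Code: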
Python3
import pandas as pd

# List of tuples: (name, age, section)
students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B'),
            ]

# Create a DataFrame object
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Iterate over two given columns
# only from the dataframe
for column in stu_df[['Name', 'Section']]:
    # Select column contents by column
    # name using [] operator
    columnSeriesObj = stu_df[column]
    print('Column Name : ', column)
    print('Column Contents : ', columnSeriesObj.values)
Output:
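The column selection in Method #3 can also be combined with items(), which avoids looking each column up a second time. This is a sketch of one possible variation, not the only way to do it; the dictionary name selected is illustrative.

```python
import pandas as pd

students = [('Ankit', 22, 'A'), ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'), ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Select two columns, then iterate over (name, Series) pairs of the selection
selected = {}
for column_name, column_data in stu_df[['Name', 'Section']].items():
    # tolist() converts the Series to a plain Python list
    selected[column_name] = column_data.tolist()

print(selected)
```

Only the Name and Section columns are visited; Age is skipped because it is not part of the selection.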
Method #4: Iterating over columns in reverse order:
We can also iterate over the columns in reverse order.
Code:
Python3
import pandas as pd

# List of tuples: (name, age, section)
students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B'),
            ]

# Create a DataFrame object
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Iterate over the sequence of column names
# in reverse order
for column in reversed(stu_df.columns):
    # Select column contents by column
    # name using [] operator
    columnSeriesObj = stu_df[column]
    print('Column Name : ', column)
    print('Column Contents : ', columnSeriesObj.values)
Output:
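As an alternative to reversed() in Method #4, the column labels can be reversed with a slice, and iloc can produce a DataFrame whose columns are already in reverse order. The sketch below assumes the same sample data; the variable names are illustrative.

```python
import pandas as pd

students = [('Ankit', 22, 'A'), ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'), ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Slicing the column Index with a step of -1 reverses the labels
reversed_names = list(stu_df.columns[::-1])

# iloc with a reversed column slice returns the columns in reverse order
reversed_df = stu_df.iloc[:, ::-1]

for column in reversed_names:
    print('Column Name : ', column)
```

The slice form is handy when the reversed labels are needed more than once, since reversed() returns a one-shot iterator.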
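Method #5: Using index position (iloc):
To iterate over the columns of a DataFrame by position, we can loop over the range from 0 up to (but not including) the number of columns and, for each position, select the column's contents using iloc[].
Code: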
Python3
import pandas as pd

# List of tuples: (name, age, section)
students = [('Ankit', 22, 'A'),
            ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'),
            ('Shivangi', 22, 'B'),
            ]

# Create a DataFrame object
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

# Iterate over the positions 0 .. (number of columns - 1);
# shape[1] is the number of columns in the dataframe
for index in range(stu_df.shape[1]):
    print('Column Number : ', index)
    # Select column by index position using iloc[]
    columnSeriesObj = stu_df.iloc[:, index]
    print('Column Contents : ', columnSeriesObj.values)
Output:
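When both the position and the name of each column are needed, enumerate() over the column labels gives the two together. The following sketch also checks that selecting by position with iloc[] and by name with [ ] returns the same data for this sample DataFrame; the dictionary name matches is illustrative.

```python
import pandas as pd

students = [('Ankit', 22, 'A'), ('Swapnil', 22, 'B'),
            ('Priya', 22, 'B'), ('Shivangi', 22, 'B')]
stu_df = pd.DataFrame(students, columns=['Name', 'Age', 'Section'],
                      index=['1', '2', '3', '4'])

matches = {}
for position, column_name in enumerate(stu_df.columns):
    by_position = stu_df.iloc[:, position]  # select by integer position
    by_name = stu_df[column_name]           # select by column label
    # Series.equals() compares both values and index labels
    matches[column_name] = by_position.equals(by_name)

print(matches)
```

Positional selection is most useful when column labels are duplicated, because selecting a duplicated label with [ ] returns a DataFrame rather than a single Series.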