📜  Convert Multiple JSON Files to CSV in Python

📅  Last modified: 2022-05-13 01:54:38.669000             🧑  Author: Mango

In this article, we will learn how to convert multiple JSON files into a CSV file in Python. Before that, recall a few terms:

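  • JSON file: A JSON file stores simple data structures and objects in JavaScript Object Notation (JSON), a standard data interchange format. It is mainly used to transmit data between web applications and servers.
  • CSV file: A CSV (comma-separated values) file stores data in tabular form. It looks like a plain spreadsheet but has a .csv extension, and it can be opened by almost any spreadsheet program, such as Microsoft Excel or Google Sheets.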

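To build one CSV file from multiple JSON files, load each JSON file into a data frame (flattening it first if the JSON is nested), then concatenate or merge the data frames into a single data frame (at least one column should be common to all the JSON files), and finally write that data frame to a CSV file. The examples below walk through the complete process: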
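The load, combine, and write steps can be sketched as one small helper for the simple case where every file has the same columns. The function name `json_files_to_csv` and its parameters are hypothetical, introduced only for illustration; the pandas calls are the same ones used in the examples that follow.

```python
import pandas as pd


def json_files_to_csv(json_paths, csv_path):
    """Load each JSON file into a DataFrame, stack them, and write one CSV.

    Hypothetical helper: assumes all files share the same column names.
    """
    # One DataFrame per input file
    frames = [pd.read_json(path) for path in json_paths]
    # Stack the rows; ignore_index=True gives the result a fresh 0..n-1 index
    combined = pd.concat(frames, ignore_index=True)
    # index=False keeps the row labels out of the CSV
    combined.to_csv(csv_path, index=False)
    return combined
```

A call might look like `json_files_to_csv(["file1.json", "file2.json"], "CSV.csv")`. Files whose columns only partly overlap need a merge instead, as in Example 2.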

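Example 1: If all columns match

In this example, we will load two JSON files, concatenate one onto the other, and convert the result to a CSV file. The JSON files used are:

file1.json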

{
    "ID":{
        "0":23,
        "1":43,
        "2":12,
        "3":13,
        "4":67,
        "5":89
        },
    "Name":{
        "0":"Ram",
        "1":"Deep",
        "2":"Yash",
        "3":"Aman",
        "4":"Arjun",
        "5":"Aditya"
        },
    "Marks":{
        "0":89,
        "1":97,
        "2":45,
        "3":78,
        "4":56,
        "5":76
        },
    "Grade":{
        "0":"B",
        "1":"A",
        "2":"F",
        "3":"C",
        "4":"E",
        "5":"C"
        }
}


file2.json

{
    "ID":{
        "0":90,
        "1":56,
        "2":34,
        "3":96,
        "4":45
         },
    "Name":{
        "0":"Akash",
        "1":"Chalsea",
        "2":"Divya",
        "3":"Sajal",
        "4":"Shubham"
        },
    "Marks":{
        "0":81,
        "1":87,
        "2":100,
        "3":89,
        "4":78
        },
    "Grade":{
        "0":"B",
        "1":"B",
        "2":"A",
        "3":"B",
        "4":"C"
        }
}


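Step 1: Load the JSON files into pandas data frames.
Step 2: Concatenate the data frames into a single data frame.
Step 3: Write the concatenated data frame to a CSV file.

The complete code with results is shown below:

Code:

Python3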
# importing packages
import pandas as pd
  
# load json file using pandas
df1 = pd.read_json('file1.json')
  
# view data
print(df1)
  
# load json file using pandas
df2 = pd.read_json('file2.json')
  
# view data
print(df2)
  
# use pandas.concat method 
df = pd.concat([df1,df2])
  
# view the concatenated dataframe
print(df)
  
# convert dataframe to csv file
df.to_csv("CSV.csv",index=False)
  
# load the resultant csv file
result = pd.read_csv("CSV.csv")
  
# and view the data
print(result)


Output:

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  13    Aman     78     C
4  67   Arjun     56     E
5  89  Aditya     76     C

   ID     Name  Marks Grade
0  90    Akash     81     B
1  56  Chalsea     87     B
2  34    Divya    100     A
3  96    Sajal     89     B
4  45  Shubham     78     C

   ID     Name  Marks Grade
0  23      Ram     89     B
1  43     Deep     97     A
2  12     Yash     45     F
3  13     Aman     78     C
4  67    Arjun     56     E
5  89   Aditya     76     C
0  90    Akash     81     B
1  56  Chalsea     87     B
2  34    Divya    100     A
3  96    Sajal     89     B
4  45  Shubham     78     C

    ID     Name  Marks Grade
0   23      Ram     89     B
1   43     Deep     97     A
2   12     Yash     45     F
3   13     Aman     78     C
4   67    Arjun     56     E
5   89   Aditya     76     C
6   90    Akash     81     B
7   56  Chalsea     87     B
8   34    Divya    100     A
9   96    Sajal     89     B
10  45  Shubham     78     C
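The third table above repeats the row labels 0 to 5 and then 0 to 4, because `pd.concat` keeps each frame's own index by default. The final CSV hides this only because it is written with `index=False`. A minimal sketch of the difference, using small inline frames (a few rows taken from file1.json and file2.json) rather than the files themselves:

```python
import pandas as pd

# Small stand-ins for the two loaded files
df1 = pd.DataFrame({"ID": [23, 43], "Name": ["Ram", "Deep"]})
df2 = pd.DataFrame({"ID": [90, 56], "Name": ["Akash", "Chalsea"]})

# Default: each frame keeps its own row labels, so labels repeat
stacked = pd.concat([df1, df2])
print(list(stacked.index))      # [0, 1, 0, 1]

# ignore_index=True renumbers the rows from 0
renumbered = pd.concat([df1, df2], ignore_index=True)
print(list(renumbered.index))   # [0, 1, 2, 3]
```

Renumbering matters mostly when the combined frame is used further in memory, for example with label-based lookups such as `df.loc[0]`, which would otherwise return more than one row.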

Example 2: If some columns match

In this example, we will load two JSON files, merge them, and convert the result to a CSV file. The JSON files used are:

file3.json

{
    "ID":{
        "0":23,
        "1":43,
        "2":12,
        "3":13,
        "4":67,
        "5":89
        },
    "Name":{
        "0":"Ram",
        "1":"Deep",
        "2":"Yash",
        "3":"Aman",
        "4":"Arjun",
        "5":"Aditya"
        },
    "Marks":{
        "0":89,
        "1":97,
        "2":45,
        "3":78,
        "4":56,
        "5":76
        }
}

file4.json

{
    "ID":{
        "0":23,
        "1":43,
        "2":12,
        "3":67,
        "4":89
        },
    "Name":{
        "0":"Ram",
        "1":"Deep",
        "2":"Yash",
        "3":"Arjun",
        "4":"Aditya"
        },
    "Grade":{
        "0":"B",
        "1":"A",
        "2":"F",
        "3":"E",
        "4":"C"
        }
}

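Step 1: Load the JSON files into pandas data frames.

Step 2: Merge the data frames using an inner, outer, left, or right join.

Step 3: Write the merged data frame to a CSV file.

The complete code with results is shown below:

Code:

Python3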

# importing packages
import pandas as pd

# load json file using pandas
df1 = pd.read_json('file3.json')

# view data
print(df1)

# load json file using pandas
df2 = pd.read_json('file4.json')

# view data
print(df2)

# use pandas.merge method, joining on the columns both files share
df_inner = pd.merge(df1, df2, how='inner', on=['ID', 'Name'])
df_outer = pd.merge(df1, df2, how='outer', on=['ID', 'Name'])
df_left = pd.merge(df1, df2, how='left', on=['ID', 'Name'])
df_right = pd.merge(df1, df2, how='right', on=['ID', 'Name'])

# convert dataframe to csv file
df_inner.to_csv("CSV_inner.csv", index=False)
df_outer.to_csv("CSV_outer.csv", index=False)
df_left.to_csv("CSV_left.csv", index=False)
df_right.to_csv("CSV_right.csv", index=False)

# load the resultant csv file
result_inner = pd.read_csv("CSV_inner.csv")
result_outer = pd.read_csv("CSV_outer.csv")
result_left = pd.read_csv("CSV_left.csv")
result_right = pd.read_csv("CSV_right.csv")

# and view the data
print(result_outer)
print(result_inner)
print(result_left)
print(result_right)

Output:

   ID    Name  Marks
0  23     Ram     89
1  43    Deep     97
2  12    Yash     45
3  13    Aman     78
4  67   Arjun     56
5  89  Aditya     76

   ID    Name Grade
0  23     Ram     B
1  43    Deep     A
2  12    Yash     F
3  67   Arjun     E
4  89  Aditya     C

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  13    Aman     78   NaN
4  67   Arjun     56     E
5  89  Aditya     76     C

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  67   Arjun     56     E
4  89  Aditya     76     C

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  13    Aman     78   NaN
4  67   Arjun     56     E
5  89  Aditya     76     C

   ID    Name  Marks Grade
0  23     Ram     89     B
1  43    Deep     97     A
2  12    Yash     45     F
3  67   Arjun     56     E
4  89  Aditya     76     C
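In the outer and left joins above, the row for Aman has NaN in Grade because that ID appears only in file3.json. When it is not obvious which file a row came from, the `indicator=True` option of `pd.merge` adds a `_merge` column that records it. A minimal sketch, using inline frames with a few rows from the two files (the variable names `marks`, `grades`, and `unmatched` are chosen here for illustration):

```python
import pandas as pd

# Stand-ins for file3.json and file4.json (a subset of their rows)
marks = pd.DataFrame({"ID": [23, 13, 67],
                      "Name": ["Ram", "Aman", "Arjun"],
                      "Marks": [89, 78, 56]})
grades = pd.DataFrame({"ID": [23, 67],
                       "Name": ["Ram", "Arjun"],
                       "Grade": ["B", "E"]})

# indicator=True adds a "_merge" column: "both", "left_only" or "right_only"
merged = pd.merge(marks, grades, how="outer", on=["ID", "Name"], indicator=True)

# Rows that were found in only one of the two inputs
unmatched = merged[merged["_merge"] != "both"]
print(unmatched[["ID", "Name", "_merge"]])
```

Dropping the `_merge` column before calling `to_csv` keeps the output file identical in layout to the ones produced above.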

Example 3: If a nested JSON file is given

In this example, we will load a nested JSON file, flatten it, and then convert it to a CSV file. The JSON file used is:

file5.json

{
  "tickets":[
    {
      "Name": "Liam",
      "Location": {
        "City": "Los Angeles",
        "State": "CA"
      },
      "hobbies": [
        "Piano",
        "Sports"
      ],
      "year" : 1985,
      "teamId" : "ATL",
      "playerId" : "barkele01",
      "salary" : 870000
    },
    {
      "Name": "John",
      "Location": {
        "City": "Los Angeles",
        "State": "CA"
      },
      "hobbies": [
        "Music",
        "Running"
      ],
      "year" : 1985,
      "teamId" : "ATL",
      "playerId" : "bedrost01",
      "salary" : 550000
    }
  ],
  "count": 2
}

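Step 1: Load the nested JSON file with the json.load() method.

Step 2: Flatten the nested column values using pandas.

Step 3: Write the flattened data frame to a CSV file.

To combine two nested files, repeat the steps above for each file and then proceed as in Example 1 or Example 2. To convert a single nested JSON file, follow the approach given below.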
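For Step 2, pandas also provides `pd.json_normalize`, which expands nested dictionaries into columns in one call. This is an alternative technique, not the one used in the article's own code. A minimal sketch, assuming records shaped like the entries under "tickets" in file5.json, written inline here so the snippet runs without the file:

```python
import pandas as pd

# Inline records shaped like the entries under "tickets" in file5.json
tickets = [
    {"Name": "Liam",
     "Location": {"City": "Los Angeles", "State": "CA"},
     "hobbies": ["Piano", "Sports"]},
    {"Name": "John",
     "Location": {"City": "Los Angeles", "State": "CA"},
     "hobbies": ["Music", "Running"]},
]

# Nested dicts become columns named parent_child; lists are left as-is
flat = pd.json_normalize(tickets, sep="_")

# Spread each row's hobbies list across columns hobbies_0, hobbies_1, ...
hobbies = pd.DataFrame(flat.pop("hobbies").tolist(),
                       index=flat.index).add_prefix("hobbies_")
flat = flat.join(hobbies)
print(flat.columns.tolist())
```

Note that `json_normalize` keeps the original key casing, so the columns come out as `Location_City` and `Location_State` rather than the lowercase names used elsewhere in this example.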

The complete code with results is shown below:

Code:

Python3

# importing packages
import pandas as pd
import json

# load json file using json.load
with open('file5.json') as file:
    data = json.load(file)

# view data
print(data)

# form the dataframe
df = pd.DataFrame(data['tickets'])

# view dataframe
print(df)

# flatten the dataframe: take each row's own city and state from its Location dict
df['location_city'] = df['Location'].apply(lambda loc: loc['City'])
df['location_state'] = df['Location'].apply(lambda loc: loc['State'])

# spread each row's hobbies list across columns hobbies_0, hobbies_1, ...
hobbies = pd.DataFrame(df['hobbies'].tolist(), index=df.index).add_prefix('hobbies_')
df = df.join(hobbies)

# remove the nested columns that are no longer needed
df = df.drop(columns=['Location', 'hobbies'])

# view dataframe
print(df)

# convert dataframe to csv file
df.to_csv("CSV.csv", index=False)

# load the resultant csv file
result = pd.read_csv("CSV.csv")

# and view the data
print(result)

Output: