How to convert a pandas DataFrame to JSON in Python?

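Data analysis is an extremely important tool in today's world, and a key aspect of it is an organized representation of data. Computer science offers many data structures for this task. This article discusses two of them, the pandas DataFrame and JSON, and shows how to convert a DataFrame to JSON format.

A pandas DataFrame is a tabular representation of data in which each column holds one field of a data entry and each row is a single data entry. JSON (JavaScript Object Notation) is a text format for representing structured data.

Note: For more information, refer to Python | Pandas DataFrame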

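Converting a pandas DataFrame to JSON

To convert a pandas DataFrame to JSON, use the DataFrame.to_json() function from the pandas library. to_json accepts several parameters that customize the resulting JSON. Let us first look at the parameters the function accepts and then explore the customizations.

Parameters: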

Parameter	Value	Use
path_or_buf	string or filename, optional	File path or object. If not specified, the result is returned as a string.
orient	'split', 'records', 'index', 'columns', 'values', 'table', default='columns' for a DataFrame ('index' for a Series)	Indication of expected JSON string format.
date_format	None, 'epoch', 'iso', default='epoch'	Type of date conversion. 'epoch' = epoch milliseconds, 'iso' = ISO8601. The default depends on the orient. For orient='table', the default is 'iso'. For all other orients, the default is 'epoch'.
double_precision	integer value, default=10	The number of decimal places to use when encoding floating point values.
force_ascii	boolean value, default=True	Force encoded string to be ASCII.
date_unit	's', 'ms', 'us', 'ns', default='ms'	The time unit to encode to, governs timestamp and ISO8601 precision. The values represent second, millisecond, microsecond, and nanosecond respectively.
default_handler	callable function	Handler to call if object cannot otherwise be converted to a suitable format for JSON. Should receive a single argument which is the object to convert and return a serializable object.
lines	boolean value, default=False	If 'orient' is 'records', write out line-delimited JSON format. Raises ValueError for any other 'orient', since the others are not list-like.
compression	'infer', 'gzip', 'bz2', 'zip', 'xz', None, default='infer'	A string representing the compression to use in the output file, only used when the first argument is a filename. By default, the compression is inferred from the filename.
index	boolean value, default=True	Whether to include the index values in the JSON string. Not including the index (index=False) is supported when orient is 'split' or 'table'; it is not valid with 'index' or 'columns'.
indent	integer value, optional	Length of whitespace used to indent each record.
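A few of the parameters in the table can be sketched together as follows. This is a minimal illustration rather than a complete tour: the DataFrame contents and the variable names (df, json_lines, rounded, pretty, lines_error) are made up for the example, and the indent parameter assumes a reasonably recent pandas release.

```python
import json

import pandas as pd

# Hypothetical sample data, used only for this sketch
df = pd.DataFrame({"name": ["a", "b"], "value": [1.23456, 2.5]})

# lines=True writes one JSON object per line; it requires orient='records'
json_lines = df.to_json(orient="records", lines=True)
rows = [json.loads(line) for line in json_lines.splitlines() if line]

# double_precision limits the number of decimal places for floats
rounded = json.loads(df.to_json(orient="records", double_precision=2))

# indent pretty-prints the output across multiple lines
pretty = df.to_json(orient="records", indent=2)

# Any orient other than 'records' combined with lines=True raises ValueError
try:
    df.to_json(orient="columns", lines=True)
    lines_error = None
except ValueError as exc:
    lines_error = str(exc)

print(json_lines)
print(rounded)
print(pretty)
print(lines_error)
```

Parsing the results with json.loads, as done here, is a convenient way to check the structure without depending on exact whitespace.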

Let us now look at a few examples of how DataFrame.to_json is used.

Example 1: Basic usage

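import numpy as np
import pandas as pd

# Build a 2x2 DataFrame of strings
data = np.array([['1', '2'], ['3', '4']])
dataFrame = pd.DataFrame(data, columns=['col1', 'col2'])

# With no arguments, to_json returns a JSON string in the 'columns' orient
json_str = dataFrame.to_json()
print(json_str)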

Output:

{"col1":{"0":"1","1":"3"},"col2":{"0":"2","1":"4"}}
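Example 1 returns a string because path_or_buf is omitted. As a rough sketch of the other case, the snippet below passes a file path instead, so to_json writes the file and returns None. The temporary directory and the file name data.json are assumptions chosen for illustration.

```python
import json
import os
import tempfile

import pandas as pd

df = pd.DataFrame([["1", "2"], ["3", "4"]], columns=["col1", "col2"])

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical output location for this sketch
    out_path = os.path.join(tmp_dir, "data.json")

    # When a path is given, the JSON is written to the file
    # and the call returns None instead of a string
    result = df.to_json(out_path)

    # Read the file back with the standard json module to inspect it
    with open(out_path, encoding="utf-8") as fh:
        loaded = json.load(fh)

print(result)
print(loaded)
```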

Example 2: Exploring the 'orient' parameter of DataFrame.to_json

import numpy as np
import pandas as pd

# Build the same 2x2 DataFrame of strings as in Example 1
data = np.array([['1', '2'], ['3', '4']])
dataFrame = pd.DataFrame(data, columns=['col1', 'col2'])

# 'split': separate lists for columns, index and data
json_split = dataFrame.to_json(orient='split')
print(f"json_split = {json_split}\n")

# 'records': a list with one object per row
json_records = dataFrame.to_json(orient='records')
print(f"json_records = {json_records}\n")

# 'index': an object keyed by index label, then by column
json_index = dataFrame.to_json(orient='index')
print(f"json_index = {json_index}\n")

# 'columns': an object keyed by column, then by index label
json_columns = dataFrame.to_json(orient='columns')
print(f"json_columns = {json_columns}\n")

# 'values': only the values, as a nested list
json_values = dataFrame.to_json(orient='values')
print(f"json_values = {json_values}\n")

# 'table': a schema description plus the data
json_table = dataFrame.to_json(orient='table')
print(f"json_table = {json_table}\n")

Output:

json_split = {"columns":["col1","col2"],"index":[0,1],"data":[["1","2"],["3","4"]]}

json_records = [{"col1":"1","col2":"2"},{"col1":"3","col2":"4"}]

json_index = {"0":{"col1":"1","col2":"2"},"1":{"col1":"3","col2":"4"}}

json_columns = {"col1":{"0":"1","1":"3"},"col2":{"0":"2","1":"4"}}

json_values = [["1","2"],["3","4"]]

json_table = {"schema":{"fields":[{"name":"index","type":"integer"},{"name":"col1","type":"string"},{"name":"col2","type":"string"}],"primaryKey":["index"],"pandas_version":"0.20.0"},"data":[{"index":0,"col1":"1","col2":"2"},{"index":1,"col1":"3","col2":"4"}]}
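One way to see how the orients differ is to parse each result and check whether the top-level JSON value is an object or an array. The sketch below does that with the standard json module and also shows index=False together with orient='split'. The names orients, parsed_types and split_no_index are illustrative only.

```python
import json

import pandas as pd

df = pd.DataFrame([["1", "2"], ["3", "4"]], columns=["col1", "col2"])

orients = ["split", "records", "index", "columns", "values", "table"]

# Map each orient to the Python type of its parsed top-level value:
# 'records' and 'values' give a list, the others give a dict
parsed_types = {
    orient: type(json.loads(df.to_json(orient=orient))).__name__
    for orient in orients
}
print(parsed_types)

# With orient='split', index=False drops the "index" entry from the output
split_no_index = df.to_json(orient="split", index=False)
print(split_no_index)
```

The list-like orients ('records', 'values') tend to suit consumers that iterate over rows, while the dict-like ones keep the index or column labels as keys.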
