Example 1
[Screenshot: contents of douban.json]
Python script:
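import pandas as pd

# lines=True: treat every line of the file as a separate JSON object (JSON Lines)
df = pd.read_json("./douban.json", lines=True, encoding="utf-8")
# Write the table to douban.xlsx (needs an Excel writer engine such as openpyxl)
df.to_excel("./douban.xlsx")
print("over.....")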
[Screenshot: contents of the resulting douban.xlsx sheet]
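For readers who do not have douban.json at hand, the following is a minimal sketch of the input shape that lines=True expects: one complete JSON object per line. The sample records and the field names "title" and "rating" are made up for illustration and are not taken from the real file; io.StringIO simply stands in for a file on disk.

```python
import io
import pandas as pd

# Made-up sample in JSON Lines form: one complete JSON object per line.
# The field names "title" and "rating" are assumptions, not taken from douban.json.
sample = (
    '{"title": "Movie A", "rating": 9.1}\n'
    '{"title": "Movie B", "rating": 8.7}\n'
)

# lines=True parses each line as its own record; StringIO stands in for the file.
df = pd.read_json(io.StringIO(sample), lines=True)
print(df)
```

Each line becomes one row and each key becomes one column, which is the table that to_excel then writes out.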
Example 2
Contents of ceshi.json:
[{"ttery":"[123]","issue":"20130801-3391"},{"ttery":"[123]","issue":"20130801-3390"},{"ttery":"[123]","issue":"20130801-3389"}]
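Python script:
# -*- coding: utf-8 -*-
import pandas as pd

# Open the file in a with-block so the handle is closed after reading
with open('ceshi.json', 'r', encoding='utf-8') as file:
    # orient='records': the JSON is a list of {column: value} objects
    df = pd.read_json(file, orient='records')
# Write only the listed columns, in this order, and leave out the index column
df.to_excel('pandas处理ceshi-json.xlsx', index=False, columns=["ttery", "issue"])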
[Screenshot: contents of the output file "pandas处理ceshi-json.xlsx"]
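One way to check what will end up in the sheet before writing it is to load the same records from a string and inspect the DataFrame. This is only a sketch: the JSON text is inlined so that no file is needed, and io.StringIO stands in for the opened file. It also makes one detail visible: the "ttery" values are quoted in the JSON, so they are read as text rather than as lists.

```python
import io
import pandas as pd

# The same three records as ceshi.json, inlined so no file is needed.
text = (
    '[{"ttery":"[123]","issue":"20130801-3391"},'
    '{"ttery":"[123]","issue":"20130801-3390"},'
    '{"ttery":"[123]","issue":"20130801-3389"}]'
)

# orient='records' matches a JSON array of {column: value} objects.
df = pd.read_json(io.StringIO(text), orient='records')

print(df)
# "[123]" is a quoted JSON string, so the cell holds text, not a Python list.
print(type(df["ttery"].iloc[0]))
```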
Example 3
Reading semi-structured JSON with pandas pd.json_normalize()
import pandas as pd

data = [
    {
        "state": "Florida",
        "shortname": "FL",
        "info": {"governor": "Rick Scott", "web": "gairuo.com"},
        "counties": [
            {"name": "Dade", "population": 12345},
            {"name": "Broward", "population": 40000},
            {"name": "Palm Beach", "population": 60000},
        ],
    },
    {
        "state": "Ohio",
        "shortname": "OH",
        "info": {"governor": "John Kasich", "web": "sin80.com"},
        "counties": [
            {"name": "Summit", "population": 1234},
            {"name": "Cuyahoga", "population": 1337},
        ],
    },
]
# The records all live under counties; the metadata kept is state, shortname,
# and web from info. governor from info is dropped.
result = pd.json_normalize(
    data,
    record_path="counties",
    meta=["state", "shortname", ["info", "web"]]
)
result
'''
         name  population    state shortname    info.web
0        Dade       12345  Florida        FL  gairuo.com
1     Broward       40000  Florida        FL  gairuo.com
2  Palm Beach       60000  Florida        FL  gairuo.com
3      Summit        1234     Ohio        OH   sin80.com
4    Cuyahoga        1337     Ohio        OH   sin80.com
'''
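To make the roles of record_path and meta more concrete, here is a rough plain-Python sketch of the same flattening. It is not how pandas implements json_normalize; the helper name flatten_counties is hypothetical, and the data is a trimmed copy of the example (one state, two counties) to keep it short.

```python
# A trimmed copy of the example data (one state, two counties).
data = [
    {
        "state": "Florida",
        "shortname": "FL",
        "info": {"governor": "Rick Scott", "web": "gairuo.com"},
        "counties": [
            {"name": "Dade", "population": 12345},
            {"name": "Broward", "population": 40000},
        ],
    },
]


def flatten_counties(records):
    """Hypothetical helper: one row per county, with parent metadata copied in."""
    rows = []
    for parent in records:
        # Iterating over the nested list plays the role of record_path="counties".
        for county in parent["counties"]:
            row = dict(county)
            # Copying parent fields plays the role of meta=[...].
            row["state"] = parent["state"]
            row["shortname"] = parent["shortname"]
            # The nested path ["info", "web"] becomes the column name "info.web".
            row["info.web"] = parent["info"]["web"]
            rows.append(row)
    return rows


rows = flatten_counties(data)
for row in rows:
    print(row)
```

Every county becomes its own row, the selected parent fields are repeated on each of those rows, and anything not listed (here the governor) never appears in the output.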