Processing JSON file data in Python with the pandas library


Example 1

Screenshot of the contents of douban.json:

[Image: contents of douban.json]


Python script:

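import pandas as pd

# lines=True reads the file as JSON Lines: one JSON object per line
df = pd.read_json("./douban.json", lines=True, encoding="utf-8")
# Write the table to douban.xlsx (requires an Excel engine such as openpyxl)
df.to_excel("./douban.xlsx")
print("over.....")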
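
Since the douban.json screenshot is not reproduced here, the following is a minimal sketch of what `lines=True` expects. The sample records and their field names (`title`, `rating`) are hypothetical and are not taken from douban.json; the text is read from an in-memory buffer so the snippet runs without any file.

```python
import io

import pandas as pd

# Hypothetical JSON Lines sample: one complete JSON object per line.
# The field names are made up for illustration only.
jsonl_text = (
    '{"title": "Movie A", "rating": 9.1}\n'
    '{"title": "Movie B", "rating": 8.7}\n'
)

# lines=True makes pandas parse each line as a separate record (row).
df_lines = pd.read_json(io.StringIO(jsonl_text), lines=True)
print(df_lines)
```

Each line becomes one row and each key becomes a column, which is why a file with one object per line needs `lines=True`.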

Screenshot of the resulting douban.xlsx spreadsheet:

[Image: contents of douban.xlsx]

Example 2

Contents of ceshi.json:

[{"ttery":"[123]","issue":"20130801-3391"},{"ttery":"[123]","issue":"20130801-3390"},{"ttery":"[123]","issue":"20130801-3389"}]

Python script:

# -*- coding: utf-8 -*-

import pandas as pd

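# Open the file in a with block so it is closed automatically
with open('ceshi.json', 'r', encoding='utf-8') as file:
    # orient='records' expects a list of objects: [{column: value}, ...]
    df = pd.read_json(file, orient='records')

# Write only the listed columns and leave out the index column
df.to_excel('pandas处理ceshi-json.xlsx', index=False, columns=["ttery", "issue"])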
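
To make the `orient='records'` layout concrete, here is a small sketch that parses the first two records of ceshi.json from an in-memory string and then serializes the DataFrame back to the same layout. The variable names are illustrative, and no file or Excel engine is needed to run it.

```python
import io
import json

import pandas as pd

# The first two records from ceshi.json, inlined as a string.
records_text = (
    '[{"ttery":"[123]","issue":"20130801-3391"},'
    '{"ttery":"[123]","issue":"20130801-3390"}]'
)

# orient="records" means the JSON is a list of {column: value} objects.
df_records = pd.read_json(io.StringIO(records_text), orient="records")

# Serializing with the same orient gives back a list of objects.
round_trip = df_records.to_json(orient="records")
print(json.loads(round_trip))
```

Each object in the list becomes one row, so the two columns `ttery` and `issue` come straight from the object keys.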

Contents of the output file "pandas处理ceshi-json.xlsx":

[Image: contents of pandas处理ceshi-json.xlsx]

Example 3

Reading semi-structured JSON with pandas pd.json_normalize()

import pandas as pd

data = [
    {
        "state": "Florida",
        "shortname": "FL",
        "info": {"governor": "Rick Scott", "web": "gairuo.com"},
        "counties": [
            {"name": "Dade", "population": 12345},
            {"name": "Broward", "population": 40000},
            {"name": "Palm Beach", "population": 60000},
        ],
    },
    {
        "state": "Ohio",
        "shortname": "OH",
        "info": {"governor": "John Kasich", "web": "sin80.com"},
        "counties": [
            {"name": "Summit", "population": 1234},
            {"name": "Cuyahoga", "population": 1337},
        ],
    },
]

# The records live under counties; the metadata kept is state, shortname,
# and web from info. The governor field in info is dropped.
result = pd.json_normalize(
    data,
    record_path="counties",
    meta=["state", "shortname", ["info", "web"]]
)
result
'''
         name  population    state shortname    info.web
0        Dade       12345  Florida        FL  gairuo.com
1     Broward       40000  Florida        FL  gairuo.com
2  Palm Beach       60000  Florida        FL  gairuo.com
3      Summit        1234     Ohio        OH   sin80.com
4    Cuyahoga        1337     Ohio        OH   sin80.com
'''
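
For comparison, when no `record_path` is given, `pd.json_normalize()` keeps one row per top-level object and flattens nested dictionaries into columns. The sketch below uses a trimmed-down copy of the data above without the `counties` lists; the `sep="_"` argument is just one possible choice of separator for the flattened column names.

```python
import pandas as pd

# Trimmed-down copy of the example data, without the counties lists.
states = [
    {"state": "Florida", "info": {"governor": "Rick Scott", "web": "gairuo.com"}},
    {"state": "Ohio", "info": {"governor": "John Kasich", "web": "sin80.com"}},
]

# Nested keys are joined with the separator: info -> info_governor, info_web.
flat = pd.json_normalize(states, sep="_")
print(flat)
```

Here both `info` fields are kept, which contrasts with the `meta` list above that selects only `info.web`.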

