我有下面的代码, 输出中也需要状态和* policy工作流名称(即星期二), 任何人都可以帮忙。
import json
import requests
import pandas as pd
from pandas.io.json import json_normalize
import ast
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', 100)a = [{'attributes': [{'key': '*policy action jobid', 'values': ['289903']}, {'key': '*policy action name', 'values': ['backup']}, {'key': '*policy name', 'values': ['Daily_Backups']}, {'key': '*policy workflow name', 'values': ['tuesday']}, {'key': 'clone retention policy', 'values': ['504:5:34']}, {'key': 'group', 'values': ['tuesday']}, {'key': 'saveset features', 'values': ['CLIENT_SAVETIME']}], 'browseTime': '2020-05-19T23:57:41+08:00', 'clientHostname': 'xyz.com', 'clientId': '7d391c52-00000004-5cda459d-5c1', 'creationTime': '2020-04-28T21:29:25+08:00', 'fileCount': 0, 'id': '1eb1', 'instances': [], 'level': 'Full', 'links': [{'href': 'https://iservera/backups/1ec1', 'rel': 'item'}], 'name': '/abc', 'retentionTime': '2020-05-19T23:57:41+08:00', 'saveTime': '2020-04-28T21:27:07+08:00', 'shortId': '2177', 'size': {'unit': 'Byte', 'value': 0}, 'type': 'File'}, {'attributes': [{'key': '*policy action jobid', 'values': ['2803']}, {'key': '*policy action name', 'values': ['backup: 1589']}, {'key': '*policy name', 'values': ['Daily_Backups: 159']}, {'key': '*policy workflow name', 'values': ['tuesday: 1588079529']}, {'key': '*ss clone retention', 'values': ['1588079529:1588079590:1824409']}, {'key': 'group', 'values': ['tuesday']}, {'key': 'saveset features', 'values': ['CLIENT_SAVETIME']}], 'browseTime': '2020-05-19T23:57:42+08:00', 'clientHostname': 'abc.com', 'clientId': 'ec3dc1', 'completionTime': '2020-04-28T21:29:47+08:00', 'creationTime': '2020-04-28T21:13:10+08:00', 'fileCount': 0, 'id': 'cc1', 'instances': [{'clone': False, 'id': '1588079529', 'status': 'Aborted', 'volumeIds': ['245614341']}], 'level': 'Full', 'links': [{'href': 'https://abc/backups/c771', 'rel': 'item'}], 'name': '/xyz', 'retentionTime': '2020-05-19T23:57:42+08:00', 'saveTime': '2020-04-28T21:10:53+08:00', 'shortId': '2141727718', 'size': {'unit': 'Byte', 'value': 36264099844}, 'type': 'NDMP'}]
df = json_normalize(a)
a = df[['clientHostname', 'completionTime', 'size.value', 'type', 'fileCount']]
print(a)```Current output is-
```clientHostnamecompletionTimesize.valuetypefileCount
0xyz.comNaN0File0
1abc.com2020-04-28T21:29:47+08:0036264099844NDMP0```Output needed is-
```clientHostnamecompletionTimesize.valuetypefileCount statusPolicy
0xyz.comNaN0File0-tuesday
1abc.com2020-04-28T21:29:47+08:0036264099844NDMP0Aborted tuesday```
#1你可以获取策略工作流名称为df [‘ attributes’ ] [0] [3] [‘ key’ ], df [‘ attributes’ ] [0] [3] [‘ values’ ]和状态为df [‘ instances ‘ ] [i] [0] [‘ status’ ], 其中i是记录号。
#2我将使用库jmespath遍历json数据:
要访问密钥, 请使用。要访问列表, 请使用[]表示法。
import jmespath
expression = jmespath.compile("""
[].
{clientHostname:clientHostname, completionTime:completionTime, "size.value":size.value, type:type, fileCount:fileCount, status:instances[].status, Policy:attributes[?key==`*policy workflow name`].values[]}
""")
res = expression.search(a)
res[{'clientHostname': 'xyz.com', 'completionTime': None, 'size.value': 0, 'type': 'File', 'fileCount': 0, 'status': [], 'Policy': ['tuesday']}, {'clientHostname': 'abc.com', 'completionTime': '2020-04-28T21:29:47+08:00', 'size.value': 36264099844, 'type': 'NDMP', 'fileCount': 0, 'status': ['Aborted'], 'Policy': ['tuesday: 1588079529']}]
进行一些清理以适合你的用例:
#get the day of the week for policy keys
res = [{key:value[0].split(':')[0]
if key=="Policy" else value
for key, value in ent.items()} for ent in res]#if list is empty, replace it with "-"
res = [{key:"-"
if (isinstance(value, list) and not value) else value
for key, value in ent.items()} for ent in res] pd.DataFrame(res)clientHostnamecompletionTimesize.valuetypefileCountstatusPolicy
0xyz.com None0File0-tuesday
1abc.com 2020-04-28T21:29:47+08:0036264099844 NDMP0[Aborted]tuesday
【如何从JSON中提取嵌套的值()】你可能需要查看数据, 看看是否需要进行其他转换
推荐阅读
- Python如何比较数据框列和2D列表()
- tuple()增加运行时间这么多是否正常()
- 使用JMESPath根据另一个查询结果过滤列表
- Python Pandas ffill if语句问题()
- Pip已自行卸载,但我不知道是什么原因造成的
- 通过Python转义HTML中的单引号()
- Python中的Altair(数据可视化)
- 2020年11个最佳iOS开发框架热门推荐
- 如何实现android的翻转动画(已添加GIF)