Problem description
I need to parse a JSON file which, unfortunately for me, does not follow the expected format. I have two issues with the data. I have already found a workaround for one of them, so I will just mention it at the end; maybe someone can help there as well.
So I need to parse entries like this:

"Test": {
    "entry": {
        "Type": "Something"
    },
    "entry": {
        "Type": "Something_Else"
    }
}, ...
The default json parser updates the dictionary and therefore keeps only the last entry. I have to store the other one as well, and I have no idea how to do this. I also have to store the keys of the various dictionaries in the same order they appear in the file, which is why I am using an OrderedDict. That works fine, so if there is any way to extend it to handle duplicate entries, I'd be grateful.
My second issue is that this very same JSON file contains entries like this:

"Test": {
    {
        "Type": "Something"
    }
}
The json.load() function raises an exception when it reaches that line in the JSON file. The only workaround I found was to manually remove the inner braces myself.
Thanks in advance.
Recommended answer
You can use JSONDecoder.object_pairs_hook to customize how JSONDecoder decodes objects. This hook function is passed a list of (key, value) pairs, which you would usually process in some way and then turn into a dict.
However, since Python dictionaries don't allow duplicate keys (and you simply can't change that), you can return the pairs unchanged in the hook and get a nested list of (key, value) pairs when you decode your JSON:
from json import JSONDecoder


def parse_object_pairs(pairs):
    # Return the (key, value) pairs untouched instead of building a dict,
    # so duplicate keys are not collapsed.
    return pairs


data = """
{"foo": {"baz": 42}, "foo": 7}
"""

decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
obj = decoder.decode(data)
print(obj)

Output (Python 3):

[('foo', [('baz', 42)]), ('foo', 7)]
How you use this data structure is up to you. As stated above, Python dictionaries don't allow duplicate keys, and there's no way around that. How would you even do a lookup based on a key? dct[key] would be ambiguous.
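To make the ambiguity concrete, here is a minimal sketch of a lookup over the nested list of pairs. The helper name values_for is hypothetical (it is not part of the json module); it simply returns every value stored under a key, in document order, and leaves the choice between them to the caller.

```python
from json import JSONDecoder


def values_for(pairs, key):
    """Hypothetical helper: return all values stored under `key`, in order."""
    return [value for k, value in pairs if k == key]


# The hook returns the pairs unchanged, so duplicates survive decoding.
decoder = JSONDecoder(object_pairs_hook=lambda pairs: pairs)
pairs = decoder.decode('{"foo": {"baz": 42}, "foo": 7}')

matches = values_for(pairs, "foo")
print(matches)  # [[('baz', 42)], 7]
```

A lookup for "foo" yields two unrelated values here, which is exactly why a plain dct[key] cannot express it.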
So you can either implement your own logic to handle lookups the way you expect them to work, or implement some sort of collision avoidance that makes keys unique when they are not, and then create a dictionary from your nested list.
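As one possible take on the first option, the sketch below collects the values of repeated keys into lists instead of renaming the keys. The hook name group_duplicates is hypothetical, and wrapping every value in a list (even for keys that occur once) is an assumption made here so that lookups always have the same shape. The sample data mirrors the example from the question.

```python
import json
from collections import OrderedDict


def group_duplicates(pairs):
    """Hypothetical hook: map each key to the list of all its values."""
    grouped = OrderedDict()
    for key, value in pairs:
        # setdefault creates the list on first sight of a key; later
        # occurrences of the same key append to it in document order.
        grouped.setdefault(key, []).append(value)
    return grouped


data = '{"Test": {"entry": {"Type": "Something"}, "entry": {"Type": "Something_Else"}}}'
obj = json.loads(data, object_pairs_hook=group_duplicates)

# Every value is a list, so single-occurrence keys are read with [0].
entry_types = [entry["Type"][0] for entry in obj["Test"][0]["entry"]]
print(entry_types)  # ['Something', 'Something_Else']
```

The trade-off is that every access needs an index, but no entry is lost and no key is renamed.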
Edit: Since you said you would like to modify the duplicate key to make it unique, here's how you'd do that:
from collections import OrderedDict
from json import JSONDecoder


def make_unique(key, dct):
    # Append _1, _2, ... to the key until it no longer collides with dct.
    counter = 0
    unique_key = key

    while unique_key in dct:
        counter += 1
        unique_key = '{}_{}'.format(key, counter)
    return unique_key


def parse_object_pairs(pairs):
    # Build an ordered mapping, renaming any key that was already seen.
    dct = OrderedDict()
    for key, value in pairs:
        if key in dct:
            key = make_unique(key, dct)
        dct[key] = value

    return dct


data = """
{"foo": {"baz": 42, "baz": 77}, "foo": 7, "foo": 23}
"""

decoder = JSONDecoder(object_pairs_hook=parse_object_pairs)
obj = decoder.decode(data)
print(obj)

Output on Python 3 (the exact repr formatting may differ between versions):

OrderedDict([('foo', OrderedDict([('baz', 42), ('baz_1', 77)])), ('foo_1', 7), ('foo_2', 23)])
The make_unique function is responsible for returning a collision-free key. In this example it just suffixes the key with _n, where n is an incremental counter. Adapt it to your needs.
Because the object_pairs_hook receives the pairs in exactly the order they appear in the JSON document, it's also possible to preserve that order by using an OrderedDict, so I included that as well.
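The same kind of hook can also be passed straight to json.load when reading from a file, since json.load and json.loads both accept an object_pairs_hook argument. The following is a sketch only: unique_pairs_hook is a hypothetical, condensed variant of the functions above, and io.StringIO stands in for a real open file whose content mirrors the example from the question.

```python
import io
import json
from collections import OrderedDict


def unique_pairs_hook(pairs):
    """Hypothetical hook: suffix repeated keys with _1, _2, ... in file order."""
    result = OrderedDict()
    for key, value in pairs:
        unique_key, counter = key, 0
        while unique_key in result:
            counter += 1
            unique_key = "{}_{}".format(key, counter)
        result[unique_key] = value
    return result


# StringIO stands in for an open file object here.
fake_file = io.StringIO(
    '{"Test": {"entry": {"Type": "Something"}, "entry": {"Type": "Something_Else"}}}'
)
obj = json.load(fake_file, object_pairs_hook=unique_pairs_hook)

keys_in_order = list(obj["Test"].keys())
types_in_order = [entry["Type"] for entry in obj["Test"].values()]
print(keys_in_order)   # ['entry', 'entry_1']
print(types_in_order)  # ['Something', 'Something_Else']
```

Both entries survive, and iterating over obj["Test"] yields them in the order they appear in the document.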