Problem description
I have an XML document; a small part of it looks like this:
<?xml version="1.0" ?>
<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">
<data>
<image imageId="1"></image>
<content>Content</content>
</data>
</i:insert>
When I parse it using ElementTree and save it to a file, I see the following:
<ns0:insert xmlns:ns0="urn:com:xml:insert" xmlns:ns1="urn:com:xml:data">
<ns1:data>
<ns1:image imageId="1"></ns1:image>
<ns1:content>Content</ns1:content>
</ns1:data>
</ns0:insert>
Why does it change the prefixes and put them everywhere? I don't have this problem with minidom. Can this be configured? The documentation for ElementTree is very poor.
The problem is that I can't find any node after parsing this way. For example, I can't find image, whether I search for it with the namespace, as {namespace}image, or as just image. Why is that? Any suggestions are greatly appreciated.
What I have already tried:
import xml.etree.ElementTree as ET
tree = ET.parse('test.xml')
root = tree.getroot()
for a in root.findall('ns1:image'):
    print(a.attrib)
This raises an error, and the next one returns nothing:
for a in root.findall('{urn:com:xml:data}image'):
    print(a.attrib)
I also tried to define a namespace mapping like this and use it:
namespaces = {'ns1': 'urn:com:xml:data'}
for a in root.findall('ns1:image', namespaces):
    print(a.attrib)
It returns nothing. What am I doing wrong?
Recommended answer
This snippet from your question,
for a in root.findall('{urn:com:xml:data}image'):
    print(a.attrib)
does not output anything because it only looks for direct {urn:com:xml:data}image children of the root of the tree.
This slightly modified code,
for a in root.findall('.//{urn:com:xml:data}image'):
    print(a.attrib)
will print {'imageId': '1'} because it uses .//, which selects matching subelements on all levels.
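The same reasoning applies to the attempt with a prefix mapping: the mapping itself is fine, but the path still has to reach below the root. Here is a minimal sketch of the difference, assuming Python 3 and the sample document from the question inlined as a string. The prefix d and the variable names are arbitrary choices for illustration.

```python
import xml.etree.ElementTree as ET

# Sample document from the question, inlined so the sketch is self-contained.
XML = """<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">
  <data>
    <image imageId="1"></image>
    <content>Content</content>
  </data>
</i:insert>"""

root = ET.fromstring(XML)

# The prefix used for searching is arbitrary; only the URI has to match.
ns = {'d': 'urn:com:xml:data'}

direct = root.findall('d:image', ns)           # direct children of the root only
nested = root.findall('.//d:image', ns)        # descendants at any depth
stepwise = root.findall('d:data/d:image', ns)  # explicit path through <data>

print(direct)
print([e.attrib for e in nested])
print([e.attrib for e in stepwise])
```

The first search comes back empty because image is a grandchild of the root, while the other two both find the element.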
Reference: https://docs.python.org/2/library/xml.etree.elementtree.html#supported-xpath-syntax
It is a bit annoying that ElementTree does not retain the original namespace prefixes by default, but keep in mind that the prefixes are not what matters; the namespace URIs are. The register_namespace() function can be used to set the wanted prefix when serializing the XML. The function has no effect on parsing or searching.
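As a rough illustration of register_namespace(), the sketch below registers the two prefixes from the question's document before serializing. It assumes Python 3, where an empty prefix stands for the default namespace; the variable names are illustrative.

```python
import xml.etree.ElementTree as ET

XML = (
    '<i:insert xmlns:i="urn:com:xml:insert" xmlns="urn:com:xml:data">'
    '<data><image imageId="1"></image><content>Content</content></data>'
    '</i:insert>'
)

# Registration is global and only influences how trees are written out.
ET.register_namespace('i', 'urn:com:xml:insert')
ET.register_namespace('', 'urn:com:xml:data')  # '' = default namespace

root = ET.fromstring(XML)
output = ET.tostring(root, encoding='unicode')
print(output)

# Searching is unaffected: the full URI is still needed in the path.
found = root.find('.//{urn:com:xml:data}image')
print(found.attrib)
```

With these registrations the output uses i:insert and unprefixed data, image and content elements instead of ns0 and ns1.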