Parsing Xml By Python Lxml Tree.xpath

I am trying to parse a huge file; a sample is below. I try to take , but I can't. It works only without this string
tree.xpath('.//LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]//*[local-name()="Name"]')

to match the Name element with a shorter XPath expression (ignoring the namespace altogether).
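As a rough, self-contained sketch of the local-name() approach: the snippet below runs the expression against a trimmed copy of the sample document from this page. The variable names (sample, matches, names) are made up for illustration, and the document is passed as bytes on the assumption that lxml may reject a str carrying an encoding declaration.

```python
import lxml.etree as ET

# Trimmed copy of the sample document (illustrative). Bytes literal, since
# lxml may reject str input that carries an encoding declaration.
sample = b'''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
  <LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
      <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain">
        <LevelLayoutSectionBase>
          <LevelLayoutItemBase>
            <Name>Tracking ID</Name>
          </LevelLayoutItemBase>
        </LevelLayoutSectionBase>
      </LevelLayout>
    </LevelLayout>
  </LevelLayouts>
</PackageLevelLayout>'''

root = ET.XML(sample)

# The outer LevelLayout is in no namespace, so it can be matched by plain name.
# local-name() then matches Name whatever namespace it belongs to.
matches = root.xpath(
    './/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]'
    '//*[local-name()="Name"]'
)

names = [el.text for el in matches]
print(names)
```

The trade-off is that local-name() would also match a Name element from any other namespace under that LevelLayout, which is why the prefix mapping is the stricter option.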

The alternative is to use a prefix-to-namespace mapping and use those on your tags:

nsmap = {'acd': 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain'}

tree.xpath('/PackageLevelLayout/LevelLayouts/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]/acd:LevelLayout/acd:LevelLayoutSectionBase/acd:LevelLayoutItemBase/acd:Name',
    namespaces=nsmap)

Solution 2:

lxml's xpath method has a namespaces parameter. You can pass it a dict mapping namespace prefixes to namespace URIs. Then you can build XPaths that use the namespace prefix:

# Bytes literal: on Python 3, lxml may reject a str that carries an encoding declaration
xml2 = b'''<?xml version="1.0" encoding="UTF-8"?>
<PackageLevelLayout>
<LevelLayouts>
    <LevelLayout levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432">
                <LevelLayout xmlns="http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
                    <LevelLayoutSectionBase>
                        <LevelLayoutItemBase>
                            <Name>Tracking ID</Name>
                        </LevelLayoutItemBase>
                    </LevelLayoutSectionBase>
                </LevelLayout>
            </LevelLayout>
    </LevelLayouts>
</PackageLevelLayout>'''

namespaces={'ns': 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain',
            'i': 'http://www.w3.org/2001/XMLSchema-instance'}

import lxml.etree as ET
# This is an lxml.etree._Element, not a tree, so don't call it tree
root = ET.XML(xml2)

nodes = root.xpath(
    '''/PackageLevelLayout/LevelLayouts/LevelLayout[@levelGuid="4a54f032-325e-4988-8621-2cb7b49d8432"]
       /ns:LevelLayout/ns:LevelLayoutSectionBase/ns:LevelLayoutItemBase/ns:Name''', namespaces = namespaces)
print(nodes)

yields

[<Element {http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain}Name at 0xb74974dc>]
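The {namespace}Name in that output is how lxml writes a namespaced tag. A small sketch of how one might pull the pieces apart and read the element's text, using a cut-down fragment rather than the full sample (NS, doc and node are illustrative names, not from the original answer):

```python
import lxml.etree as ET

NS = 'http://schemas.datacontract.org/2004/07/ArcherTech.Common.Domain'

# Cut-down fragment of the sample: only the namespaced part is kept.
doc = (b'<LevelLayoutItemBase xmlns="' + NS.encode('ascii') + b'">'
       b'<Name>Tracking ID</Name>'
       b'</LevelLayoutItemBase>')
root = ET.XML(doc)

# Same prefix-mapping technique as above, relative to the fragment root.
node = root.xpath('ns:Name', namespaces={'ns': NS})[0]

# node.tag holds the namespace URI in braces followed by the local name.
print(node.tag)

# QName splits the tag into namespace URI and local name.
qname = ET.QName(node)
print(qname.namespace)
print(qname.localname)

# The text content of the matched element.
print(node.text)
```

If only the text is needed, appending /text() to the XPath expression is another option.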
