I figured out why my XML parsing code works fine using the [pure-Python ElementTree XML parsing module][elementtree] but fails when using [the speedy and memory-optimized cElementTree XML parsing module][celementtree].
[The XPath 1.0 specification][xpath] says `'.'` is short-hand for `'self::node()'`, which selects the context node itself.
Parsing an XML document and selecting the context node with ElementTree in Python 2.5:
>>> from xml.etree import ElementTree
>>> ElementTree.VERSION
'1.2.6'
>>> doc = "
>>> node1 = ElementTree.fromstring(doc).find('./Example')
>>> node1
>>> node1.find('.')
>>> node1.find('.') == node1
True
See how the result of `node1.find('.')` is the node itself? [As it should be][selfnode].
Parsing an XML document and selecting the context node with cElementTree in Python 2.5:
>>> from xml.etree import cElementTree
>>> doc = "
>>> node2 = cElementTree.fromstring(doc).find('./Example')
>>> node2
>>> node2.find('.')
>>> node2.find('.') == node2
False
Balls. The result of `node2.find('.')` is `None`.
However! I have a kludgey work-around that works whether you use ElementTree or cElementTree. Use `'./'` instead of `'.'`:
>>> node1.find('./')
>>> node1.find('./') == node1
True
>>> node2.find('./')
>>> node2.find('./') == node2
True
*Kludgey because `'./'` is not a valid XPath expression.*
So we are back on track. The work-around also holds for Python 2.6, which ships the same version of ElementTree.
Fortunately Python 2.7 got a new version of ElementTree and the bug is fixed:
>>> from xml.etree import ElementTree
>>> ElementTree.VERSION
'1.3.0'
>>> doc = "
>>> node3 = ElementTree.fromstring(doc).find('./Example')
>>> node3
>>> node3.find('.')
>>> node3.find('.') == node3
True
However! They also fixed my kludgey work-around:
>>> node3.find('./')
>>> node3.find('./') == node3
False
So I can’t write a single path expression that works for all three versions. This is annoying. I was hoping to just replace ElementTree with the C version, which makes my code run in one third of the time (the XML parts of it run in one tenth of the time). And I cannot install any compiled modules: the code can rely only on Python 2.5’s standard library.
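One possible way to sidestep the difference, as a minimal sketch only: never hand `'.'` to `find()` in the first place. The helper `find_compat` below is a hypothetical name of my own, not part of ElementTree. It returns the node itself for `'.'` and passes every other path to `find()` unchanged. It assumes every lookup in the code can be routed through the helper, and the sample document is made up for illustration.

```python
from xml.etree import ElementTree


def find_compat(node, path):
    """Return the node itself for '.', otherwise defer to node.find()."""
    # Hypothetical helper: '.' never reaches the library, so the differing
    # behaviour between ElementTree versions does not come into play.
    if path == '.':
        return node
    return node.find(path)


# Made-up sample document, for illustration only.
root = ElementTree.fromstring('<Root><Example/></Root>')
example = find_compat(root, './Example')

# The context node is returned unchanged, whatever the library version.
assert find_compat(example, '.') is example
```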
[celementtree]: http://effbot.org/zone/celementtree.htm
[elementtree]: http://effbot.org/zone/element-index.htm
[xpath]: http://www.w3.org/TR/xpath/
[selfnode]: http://www.w3.org/TR/xpath/#path-abbrev