The following bug has been logged online:
Bug reference: 5075
Logged by: Marek Lewczuk
Email address: marek(at)lewczuk(dot)com
PostgreSQL version: 8.4.0
Operating system: All
Description: Text search parser does not identify an XML tag when an
attribute name contains an underscore
Details:
Please execute the following example:
select * from ts_debug('english', '<img width="182" height="120"
align="right" style="margin: 0px 0px 5px 5px;" test_aa="26461"/>')
In the result you will see that <img/> is not identified as an XML tag;
instead it is split into words, blank spaces, etc. The cause is that the
last attribute, "test_aa", has an underscore in its name: when the
underscore is removed, the img tag is correctly identified as an XML tag.
The XML specification allows underscores in tag and attribute names.
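
As an independent check of that last point, here is a minimal sketch that
feeds the same tag to Python's standard XML parser (xml.etree.ElementTree,
which is backed by expat). It is not part of PostgreSQL and says nothing
about how the text search parser works internally; it only shows that a
conforming XML parser accepts an attribute named "test_aa". The names TAG
and parse_tag are made up for this illustration, and the line break inside
the original SQL string literal is replaced by a single space.

```python
import xml.etree.ElementTree as ET

# The tag from the report, joined onto one logical line
# (the original literal contains a newline after height="120").
TAG = (
    '<img width="182" height="120" align="right" '
    'style="margin: 0px 0px 5px 5px;" test_aa="26461"/>'
)


def parse_tag(markup: str) -> dict:
    """Parse a single XML element and return its attributes as a dict.

    Raises xml.etree.ElementTree.ParseError if the markup is not
    well-formed XML.
    """
    element = ET.fromstring(markup)
    return dict(element.attrib)


if __name__ == "__main__":
    attrs = parse_tag(TAG)
    # The underscore-containing attribute is parsed like any other.
    print("test_aa" in attrs, attrs["test_aa"])
```

If the markup were not well-formed, parse_tag would raise ParseError
instead of returning the attributes.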