From: | Andrew Dunstan <andrew(at)dunslane(dot)net> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Oleg Bartunov <oleg(at)sai(dot)msu(dot)su> |
Subject: | Re: fulltext parser strange behave |
Date: | 2007-11-13 19:42:18 |
Message-ID: | 4739FE1A.3090508@dunslane.net |
Lists: | pgsql-hackers pgsql-patches |
Tom Lane wrote:
> Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
>
>> I've just been looking at the state machine in wparser_def.c. I think
>> the processing for entities is also a few bob short in the pound. It
>> recognises decimal numeric character references, but not hexadecimal
>> numeric character references. That's fairly silly since the HTML spec
>> specifically says the latter are "particularly useful". The rules for
>> named entities are also deficient w.r.t. digits, just like the case of
>> tags that Tom noticed. This isn't academic: HTML features a number of
>> named entities with digits in the name (sup2, frac14 for example).
>>
>
>
>> In XML at least, legal names are defined by the following rules from the
>> spec:
>> ...
>> [A-Za-z:_][A-Za-z0-9:_.-]*
>>
>
>
>> I suggest we use that or something very close to it as the rule for
>> names in these patterns.
>>
>
> No objections here. Who wants to patch wparser_def?
>
>
>
I can get to it some time in the next week; I'm rather snowed under right now.
BTW, I'm also suspicious of the clause that allows <?xml ... It appears
that it will allow <?xfoo and <?XFOO as well, which seems quite odd,
especially the latter.
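
To make the proposal concrete, here is a rough Python sketch of the rules discussed above. It is not the wparser_def.c state machine, only an illustration: the names NAME, ENTITY_RE, XML_DECL_RE, is_entity and starts_xml_decl are made up for this example, the name rule is the ASCII-only pattern quoted above rather than the full XML production, and the <?xml check assumes the intent is to accept only the literal lowercase target.

```python
import re

# Name rule quoted in this thread (ASCII-only simplification of XML names).
NAME = r"[A-Za-z:_][A-Za-z0-9:_.-]*"

# Hypothetical pattern covering the three kinds of references discussed.
ENTITY_RE = re.compile(
    r"&(?:"
    r"#[0-9]+"               # decimal numeric character reference, e.g. &#169;
    r"|#[xX][0-9A-Fa-f]+"    # hexadecimal numeric character reference, e.g. &#xA9;
    rf"|{NAME}"              # named entity; digits allowed after the first char
    r");"
)

# Assumption: only the exact lowercase target "xml" should be accepted,
# followed by whitespace or '?', so <?xfoo and <?XFOO are rejected.
XML_DECL_RE = re.compile(r"<\?xml(?=[\s?])")


def is_entity(text: str) -> bool:
    """Return True if the whole string is one entity or character reference."""
    return ENTITY_RE.fullmatch(text) is not None


def starts_xml_decl(text: str) -> bool:
    """Return True if the string begins with a literal <?xml declaration."""
    return XML_DECL_RE.match(text) is not None
```

With these rules, &sup2;, &frac14; and &#xA9; are all accepted as entities, while <?xfoo and <?XFOO no longer pass as an XML declaration.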
cheers
andrew