From: | "John D(dot) Burger" <john(at)mitre(dot)org> |
---|---|
To: | PostgreSQL-general general <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Queries with Regular Expressions |
Date: | 2006-04-06 20:13:01 |
Message-ID: | eba60a0256eed477b6be464374a7594c@mitre.org |
Lists: | pgsql-general |
> But I just can't make it work correctly using brackets:
> SELECT field FROM table WHERE field ~* 'ch[aã]o';
>
> It just returns tuples that have 'chao', but not 'chão'.
>
> My queries are utf-8 and the database is SQL_ASCII.
I suspect the bracketed expression is turning into [aXY], where X and Y
are the two bytes that encode ã in UTF-8. A bracket expression matches
exactly one character, and here each byte counts as a character, so the
regular expression is only going to match strings of the form chao, chXo
and chYo. To make sure that this is what's happening, try this:
select length('ã');
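I bet you get back 2, not 1, which would mean the server is counting
bytes rather than characters. I don't know whether a UTF-8 database
handles this correctly. The safest thing to do may be to use alternation
instead of brackets, with queries like this: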
SELECT field FROM table WHERE field ~* 'ch(a|ã)o';
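To illustrate the mechanism outside the database, here is a minimal Python sketch. It is only a simulation, not PostgreSQL's regex engine: it assumes the server compares raw bytes with no multibyte awareness, which is what I suspect SQL_ASCII does. The helper name byte_regex_matches is made up for this example.

```python
import re

A_TILDE = "\u00e3"  # precomposed 'ã' (U+00E3)


def byte_regex_matches(pattern: str, text: str) -> bool:
    """Match the way a byte-oriented engine would (assumption):
    pattern and text are both UTF-8 encoded, and every byte is
    treated as a separate character."""
    return re.search(pattern.encode("utf-8"),
                     text.encode("utf-8"),
                     re.IGNORECASE) is not None


bracket = "ch[a" + A_TILDE + "]o"       # becomes ch[aXY]o at the byte level
alternation = "ch(a|" + A_TILDE + ")o"  # becomes ch(a|XY)o at the byte level
chao_tilde = "ch" + A_TILDE + "o"

# 'ã' occupies two bytes in UTF-8, which is why a byte count reports 2.
print(len(A_TILDE.encode("utf-8")))

# The bracket form matches one byte between 'ch' and 'o', so it finds
# 'chao' but cannot span the two bytes of 'ã'.
print(byte_regex_matches(bracket, "chao"))
print(byte_regex_matches(bracket, chao_tilde))

# The alternation keeps the two bytes together as a sequence.
print(byte_regex_matches(alternation, chao_tilde))
```

If the database really is working byte by byte, the bracket form should fail on the accented string in the same way, while the alternation keeps working because XY stays an ordered two-byte sequence.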
- John D. Burger
MITRE