From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Frans <frans(at)geodan(dot)nl> |
Cc: | pgsql-bugs(at)postgresql(dot)org |
Subject: | Re: PostgreSQL 8.3.7: soundex function returns UTF-16 characters |
Date: | 2009-04-06 15:34:30 |
Message-ID: | 6392.1239032070@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Frans <frans(at)geodan(dot)nl> writes:
> We have just discovered a problem with the soundex function in
> PostgreSQL 8.3.7. The problem is easy to reproduce. The following query
> returns the ASCII code of the soundex representation of the Greek letter Pi:
> select ascii (soundex(''));
> In PostgreSQL 8.2.6 the result would be 0 (character null). In
> PostgreSQL 8.3.7 the return value is 944, which is the UTF-16 code of
> this letter.
Hm, I take it you are working in database encoding utf8? The
fuzzystrmatch module doesn't really work with utf8 (nor any other
multibyte encoding), because it depends on the <ctype.h> functions.
What you'll probably get when applying it to non-ascii utf8 is
an invalidly encoded string.
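A minimal sketch of why byte-at-a-time processing breaks on UTF-8, written in Python for illustration. It is not the fuzzystrmatch code: `bytewise_first_letter` and `is_valid_utf8` are hypothetical helpers, and the Greek small letter pi (U+03C0) is an assumed test input. The helper treats every byte as one character, the way a routine built on the <ctype.h> functions would, so it keeps only the lead byte of a two-byte sequence.

```python
def bytewise_first_letter(data: bytes) -> bytes:
    """Hypothetical single-byte routine: return the first byte of the input,
    uppercased if it is an ASCII letter. Every byte is treated as a character."""
    if not data:
        return b""
    # bytes.upper() only changes ASCII a-z; bytes >= 0x80 pass through unchanged.
    return data[:1].upper()


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Assumed input: GREEK SMALL LETTER PI, encoded in UTF-8 as the two bytes CF 80.
pi = "\u03c0".encode("utf-8")

print(bytewise_first_letter(b"robert"))          # ASCII input behaves as expected
print(is_valid_utf8(pi))                         # the full two-byte sequence is valid
print(is_valid_utf8(bytewise_first_letter(pi)))  # the lone lead byte CF is not
```

Splitting the sequence after its first byte leaves a lead byte with no continuation byte, which is one way an invalidly encoded string can arise.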
This is a known limitation that probably should be better documented.
It was just as broken in 8.2 (and every previous version), though.
regards, tom lane