From: | Indra Heckenbach <indra(at)macnica(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: unexpected query behavior with UTF text |
Date: | 2003-10-23 02:57:49 |
Message-ID: | 3F9743AD.5090209@macnica.com |
Lists: | pgsql-general |
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-8859-1">
<title></title>
</head>
<body text="#000000" bgcolor="#ffffff">
Hi Tom,<br>
<br>
I solved the problem by doing<br>
<br>
initdb --locale=ja_JP.utf8<br>
<br>
Unfortunately,<br>
<br>
initdb --locale=en_US.utf8<br>
<br>
does not fix it. Do you have any idea why? I would expect equality
tests to work in any locale.<br>
<br>
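A minimal sketch of what "equality depends on the locale" means in practice, assuming Python 3 on the same machine (none of this is from the original report). The quoted reply below notes that text = goes through strcoll(); the hypothetical helper collation_equal() calls strcoll() on two strings under a named locale and reports whether they collate as equal. The sample strings and the list of locale names are illustrative assumptions, and a locale that is not installed is reported as None.<br>

```python
import locale


def collation_equal(a, b, locale_name):
    """Return True if strcoll() treats a and b as equal under locale_name,
    or None if that locale is not installed on this machine."""
    saved = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        return locale.strcoll(a, b) == 0
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # always restore


# Illustrative strings: same ASCII prefix, different kanji after it.
a = "abc\u6771\u4eac"
b = "abc\u5927\u962a"

for name in ("C", "en_US.UTF-8", "ja_JP.UTF-8"):
    print(name, collation_equal(a, b, name))
```

If a locale prints True for these two different strings, its collation data is treating the kanji as ignorable, which would match the behavior seen with the = operator. The C locale applies no locale-specific collation rules, so it should print False.<br>
<br>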
thanks,<br>
Indra<br>
<br>
<br>
<br>
Tom Lane wrote:<br>
<blockquote type="cite" cite="mid11056(dot)1066831136(at)sss(dot)pgh(dot)pa(dot)us">
<pre wrap="">Indra Heckenbach <a class="moz-txt-link-rfc2396E" href="mailto:indra(at)macnica(dot)com">&lt;indra(at)macnica(dot)com&gt;</a> writes:
</pre>
<blockquote type="cite">
<pre wrap="">I have recently come across an unusual behavior with Postgres 7.3.4 on a
Linux RH 9 system. My database has encoding set to "UNICODE", and the
table includes Japanese text. I'm trying to issue a query like this:
</pre>
</blockquote>
<blockquote type="cite">
<pre wrap="">SELECT * FROM sales WHERE name='ja-text';
</pre>
</blockquote>
<blockquote type="cite">
<pre wrap="">This query ignores all Japanese characters in the comparison text. It
matches properly on ASCII characters, but skips right over Japanese ones.
</pre>
</blockquote>
<pre wrap="">The text = operator depends on strcoll(), which is locale-sensitive. It sure
appears that your locale is designed to ignore Japanese characters :-(
</pre>
<blockquote type="cite">
<pre wrap="">I found a related issue on the mailing list, where locale setting was
causing something similar. However, my locale is set to "en_US.UTF-8",
which is the solution proposed to the other problem.
</pre>
</blockquote>
<pre wrap="">We have heard before that RH9's default locale setting is seriously
broken. This seems to be additional evidence for that opinion. I'd
recommend re-initdb'ing in locale C.

Also, you say "your locale", but how certain are you that that is the
database's locale, and not just the one in your own user environment?
It'd be a good idea to use pg_controldata to check the database settings.

regards, tom lane
</pre>
</blockquote>
<br>
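The pg_controldata check suggested in the quoted reply could be scripted roughly as follows. This is a sketch under assumptions: it supposes that pg_controldata prints "label: value" lines and that the locale entries carry labels starting with LC_ (such as LC_COLLATE and LC_CTYPE). The helper names, the data directory argument, and the sample text are hypothetical and not taken from a real run.<br>

```python
import subprocess


def parse_locale_fields(controldata_text):
    """Collect entries whose label starts with 'LC_' from 'label: value' lines."""
    fields = {}
    for line in controldata_text.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.strip().startswith("LC_"):
            fields[label.strip()] = value.strip()
    return fields


def database_locale(data_dir):
    """Run pg_controldata on data_dir and return the LC_* settings it reports.

    Assumes pg_controldata is on PATH and accepts the data directory
    as its only argument.
    """
    result = subprocess.run(
        ["pg_controldata", data_dir],
        capture_output=True, text=True, check=True,
    )
    return parse_locale_fields(result.stdout)


# Hypothetical sample text, shaped like "label: value" lines, for illustration only.
sample = "LC_COLLATE:    en_US.UTF-8\nLC_CTYPE:      en_US.UTF-8\n"
print(parse_locale_fields(sample))
```

Comparing what database_locale() returns for the cluster's data directory against the LANG and LC_* variables in the shell would show whether the database was initialized with the locale you think it was.<br>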
</body>
</html>