From: | Robert Treat <xzilla(at)users(dot)sourceforge(dot)net> |
---|---|
To: | pgsql-admin(at)postgresql(dot)org, meetesh(dot)karia(at)alumni(dot)duke(dot)edu |
Subject: | Re: Encoding problems with migration from 8.0.14 to 8.3.0 on Windows |
Date: | 2008-03-13 03:28:42 |
Message-ID: | 200803122328.42683.xzilla@users.sourceforge.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-admin pgsql-hackers |
On Wednesday 12 March 2008 09:37, Meetesh Karia wrote:
> One quick addition to this:
>
> The column I'm creating this unique index on is a varchar(255) and the
> command I was running was:
>
> create unique index foo_name on foo (name);
>
> If I use the following, it now works:
>
> create unique index foo_name on foo (cast(name as bytea));
>
> Thoughts?
>
> Meetesh
>
> Meetesh Karia wrote:
> > Hi all,
> >
> > I'm trying to migrate from 8.0.14 on Windows (Vista Home Premium) to
> > 8.3.0 and I've been trying to solve what appears to be an encoding
> > problem. My old db was in the UNICODE encoding. I know that this
> > isn't supported on 8.0.x, but it was a restore of a db from a Linux
> > environment and postgres didn't appear to have any problems with it.
> >
> > My 8.3 server and client encodings are UTF8 and I used pg_dumpall (I
> > tried the 8.0 and 8.3 versions) to dump the db. However, when I tried
> > to restore the db, I got an error during index creation which wouldn't
> > let me create a unique index on a column that had all unique values
> > (it had the index in 8.0 and a group by having query with no indexes
> > on the table confirms uniqueness). The thing that this column does
> > have however is values like:
> >
> > 'Bruehl'
> > 'Brühl'
> >
> > I created a blank table with the unique index on it and inserted rows
> > one at a time until I confirmed that it was the above values that were
> > causing a problem. Running the following query shows the difference
> > in the hex encoded values (I changed my client encoding to WIN1250 to
> > get the below to show up correctly):
> >
> > select name, encode(decode(name, 'escape'), 'hex') from ...
> >
> > name | encode
> > ---------------+----------------------------
> > Daniel Brühl | 44616e69656c204272c3bc686c
> > Daniel Bruehl | 44616e69656c2042727565686c
> > (2 rows)
> >
> > I've also tried exporting using an encoding of WIN1250 but I get
> > errors like this:
> >
> > pg_dump: Error message from server: ERROR: character 0xc383 of
> > encoding "UNICODE" has no equivalent in "WIN1250"
> >
> > Anyone have any thoughts or suggestions? Why would the index creation
> > fail? Is there a workaround?
> >
I'm not convinced your problem isn't solved by proper setting of
client_encoding for both input and output:
pagila=# create table x (r varchar(255) unique);
NOTICE: CREATE TABLE / UNIQUE will create implicit index "x_r_key" for
table "x "
CREATE TABLE
pagila=# set client_encoding=WIN1250;
SET
pagila=# insert into x (r) values ('Daniel Brühl');
INSERT 0 1
pagila=# insert into x (r) values ('Daniel Bruehl');
INSERT 0 1
pagila=# select * from x;
r
---------------
Daniel Brühl
Daniel Bruehl
(2 rows)
--
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL
From | Date | Subject | |
---|---|---|---|
Next Message | Scott Marlowe | 2008-03-13 05:50:21 | Re: migration of 7.4 to 8.1 |
Previous Message | Robert Treat | 2008-03-13 03:17:59 | Re: migration of 7.4 to 8.1 |
From | Date | Subject | |
---|---|---|---|
Next Message | Webb Sprague | 2008-03-13 04:46:07 | Re: Ideas input sought for this year's SOC page |
Previous Message | Dann Corbit | 2008-03-13 01:05:55 | TIMESTAMP and daylight savings time question |