From: | Gregory Maxwell <gmaxwell(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Upcoming PG re-releases |
Date: | 2005-12-04 17:19:32 |
Message-ID: | e692861c0512040919x56c7b18fva497a198e4195707@mail.gmail.com |
Lists: | pgsql-hackers pgsql-www |
On 12/4/05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Paul Lindner <lindner(at)inuus(dot)com> writes:
> > On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
> >> Paul Lindner <lindner(at)inuus(dot)com> writes:
> >>> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> >>
> >> Is that really a one-size-fits-all solution? Especially with -c?
>
> > I'd say yes, and the -c flag is needed so iconv strips out the
> > invalid characters.
>
> That's exactly what's bothering me about it. If we recommend that
> we had better put a large THIS WILL DESTROY YOUR DATA warning first.
> The problem is that the data is not "invalid" from the user's point
> of view --- more likely, it's in some non-UTF8 encoding --- and so
> just throwing away some of the characters is unlikely to make people
> happy.
Nor is it even guaranteed to make the data load: if the column has a
unique constraint and removing the non-UTF8 bytes makes two rows hold
the same data where they didn't before, the load fails on the duplicate.
The way to preserve the data is to switch the column to bytea.
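
To make the collision concrete, here is a minimal Python sketch. It only approximates what iconv -c does (Python's errors="ignore" decoder is a stand-in, not the same implementation), and the two Latin-1 sample values and the helper name strip_invalid_utf8 are made up for illustration:

```python
def strip_invalid_utf8(raw: bytes) -> bytes:
    """Drop bytes that do not form valid UTF-8 (rough stand-in for iconv -c)."""
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


# Hypothetical sample data: two distinct values stored as Latin-1, not UTF-8.
rows = [b"caf\xe9", b"caf\xe8"]

# After stripping, the trailing non-UTF8 byte is gone from both values.
stripped = [strip_invalid_utf8(r) for r in rows]

# Distinct before, identical after: a unique constraint would now reject one row.
collides = len(set(rows)) == 2 and len(set(stripped)) == 1

# Keeping the raw bytes untouched (which is what a bytea column does)
# leaves the two values distinguishable.
hex_rows = [r.hex() for r in rows]

print(stripped, collides, hex_rows)
```

The point is only that stripping is lossy while storing the bytes as-is is not; which encoding the data was really in still has to be worked out separately.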