Upcoming PG re-releases

From:	Gregory Maxwell <gmaxwell(at)gmail(dot)com>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Upcoming PG re-releases
Date:	2005-12-04 17:19:32
Message-ID:	e692861c0512040919x56c7b18fva497a198e4195707@mail.gmail.com
Lists:	pgsql-hackers pgsql-www

On 12/4/05, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Paul Lindner <lindner(at)inuus(dot)com> writes:
> > On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
> >> Paul Lindner <lindner(at)inuus(dot)com> writes:
> >>> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> >>
> >> Is that really a one-size-fits-all solution? Especially with -c?
>
> > I'd say yes, and the -c flag is needed so iconv strips out the
> > invalid characters.
>
> That's exactly what's bothering me about it. If we recommend that
> we had better put a large THIS WILL DESTROY YOUR DATA warning first.
> The problem is that the data is not "invalid" from the user's point
> of view --- more likely, it's in some non-UTF8 encoding --- and so
> just throwing away some of the characters is unlikely to make people
> happy.

Nor is it even guaranteed to make the data load: if the column has a
unique constraint and the removal of the non-UTF8 characters makes two
rows have the same data where they didn't before, the load fails.
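
To make the collision concrete, here is a minimal sketch in Python. It assumes the stored bytes are really Latin-1, and it uses Python's errors="ignore" decoding as a stand-in for what iconv -c does (dropping bytes that are not valid UTF-8). The function name and sample values are made up for illustration.

```python
def strip_invalid_utf8(raw: bytes) -> str:
    """Decode as UTF-8, silently dropping any bytes that are not valid UTF-8.

    This only approximates `iconv -c -f UTF8 -t UTF8`; it is not iconv itself.
    """
    return raw.decode("utf-8", errors="ignore")


# Two distinct values that were stored as Latin-1 bytes (hypothetical data)
# in a database that believes its contents are UTF-8.
row_a = "café".encode("latin-1")  # b"caf\xe9"
row_b = "cafè".encode("latin-1")  # b"caf\xe8"

# Before the cleanup the two rows differ, so a unique constraint is satisfied.
distinct_before = row_a != row_b

# After the cleanup the accented byte is gone from both rows.
cleaned_a = strip_invalid_utf8(row_a)
cleaned_b = strip_invalid_utf8(row_b)
collides_after = cleaned_a == cleaned_b

print(distinct_before, cleaned_a, cleaned_b, collides_after)
```

Both rows end up as "caf", so reloading them into a unique-constrained column would be rejected as a duplicate key.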

The way to preserve the data is to switch the column to be a bytea.
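
As a rough illustration of why keeping the raw bytes is lossless, the sketch below (Python, with made-up sample values and an assumed Latin-1 source encoding) treats the values as opaque byte strings, which is roughly how a bytea column treats them. The values stay distinct, and they can still be decoded properly once the real encoding is known.

```python
# Hypothetical values that are not valid UTF-8 (here: Latin-1 bytes).
raw_values = [b"caf\xe9", b"caf\xe8"]

# Kept as raw bytes, nothing is dropped, so the values remain distinct
# and a uniqueness check on them still passes.
still_unique = len(set(raw_values)) == len(raw_values)

# Later, once the actual source encoding has been identified (assumed to be
# Latin-1 in this sketch), the original text can be recovered in full.
recovered = [value.decode("latin-1") for value in raw_values]

# Re-encoding the recovered text as UTF-8 now yields valid, distinct data.
as_utf8 = [text.encode("utf-8") for text in recovered]

print(still_unique, recovered, as_utf8)
```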

In response to

Re: Upcoming PG re-releases at 2005-12-04 16:52:45 from Tom Lane

Responses

Re: Upcoming PG re-releases at 2005-12-04 18:55:05 from Martijn van Oosterhout
