From: | Gregory Maxwell <gmaxwell(at)gmail(dot)com> |
---|---|
To: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
Cc: | Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>, Peter Eisentraut <peter_e(at)gmx(dot)net>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Upcoming PG re-releases |
Date: | 2005-12-08 22:54:35 |
Message-ID: | e692861c0512081454u560e6cc2h41f09293d4e5f2d1@mail.gmail.com |
Lists: | pgsql-hackers pgsql-www |
On 12/8/05, Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> wrote:
> > A script which identifies non-utf-8 characters and provides some
> > context, line numbers, etc, will greatly speed up the process of
> > remedying the situation.
>
> I think the best we can do is the "iconv -c with the diff" idea, which
> is already in the release notes. I suppose we could merge the iconv and
> diff into a single command, but I don't see a portable way to output the
> iconv output to stdout, /dev/stdin not being portable.

No, what people who care about fixing their data need is a loadable
strip_invalid_utf8() function that works in older versions. Then finding
the bad rows is just:
select * from bar where foo != strip_invalid_utf8(foo);
The function would be useful in general, too. For example, if you have
an application that doesn't already have much utf8 logic and you want
to use a text field, stripping may be exactly the behaviour you want.
Lots of simple web applications fall into that category.
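
To make the idea concrete, here is a rough client-side sketch in Python of what such a function could do. It is not the loadable server-side function proposed above; the names strip_invalid_utf8 and find_invalid_lines are only illustrative, and Python's decoder may not agree with the server's validator in every edge case. It drops invalid byte sequences (similar in spirit to iconv -c) and reports which lines of a dump would change, with line numbers for context.

```python
def strip_invalid_utf8(raw: bytes) -> bytes:
    """Return raw with every byte sequence that is not valid UTF-8 removed.

    Assumption: Python's strict UTF-8 decoder stands in for the server's
    validation rules; the two may differ in edge cases.
    """
    return raw.decode("utf-8", errors="ignore").encode("utf-8")


def find_invalid_lines(raw: bytes):
    """Yield (line_number, original_line, stripped_line) for lines that change."""
    for lineno, line in enumerate(raw.split(b"\n"), start=1):
        cleaned = strip_invalid_utf8(line)
        if cleaned != line:
            yield lineno, line, cleaned


if __name__ == "__main__":
    # Hypothetical sample data: line 2 contains a stray Latin-1 byte (0xE9).
    sample = b"plain ascii\ncaf\xe9 au lait\ncaf\xc3\xa9 is fine\n"
    for lineno, original, cleaned in find_invalid_lines(sample):
        print(f"line {lineno}: {original!r} -> {cleaned!r}")
```

Comparing each value with its stripped form is the same test as the foo != strip_invalid_utf8(foo) query, just run against a dump file instead of inside the database.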