From: | Michael Fuhr <mike(at)fuhr(dot)org> |
---|---|
To: | Chris Hoover <revoohc(at)gmail(dot)com> |
Cc: | "pgsql-admin(at)postgresql(dot)org Admin" <pgsql-admin(at)postgresql(dot)org> |
Subject: | Re: Help with High value unicode characters |
Date: | 2007-08-08 07:57:37 |
Message-ID: | 20070808075737.GA46023@winnie.fuhr.org |
Lists: | pgsql-admin |

On Tue, Aug 07, 2007 at 05:09:35PM -0400, Chris Hoover wrote:
> We need some help, we have some what we believe are high value unicode
> characters (Unicode 0x2).

What do you mean by "high value unicode characters (Unicode 0x2)"?
Characters with code points in a plane other than Plane 0 (BMP,
Basic Multilingual Plane), i.e., with a code point greater than
U+FFFF?

> How can you search and replace for these? We are storing this data
> in a text field, and having the data contain this unicode value is
> violating our xml rules the application uses and causing abends in
> our application.

If I understand what you're asking then you should be able to use
regexp_replace (8.1 and later) to fix the data. Example:

    UPDATE tablename
    SET columnname = regexp_replace(columnname, E'[\\U00010000-\\U0010FFFF]+', '', 'g')
    WHERE columnname ~ E'[\\U00010000-\\U0010FFFF]';

If that doesn't help then please clarify the problem.
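
Before running the UPDATE, it may be worth previewing how many rows the pattern matches. The following is a minimal sketch that reuses the placeholder names tablename and columnname from the example above. The second query rests on an assumption about the question: if "0x2" actually meant the control character U+0002, which XML 1.0 does not allow, then a class of control characters would be needed instead of the supplementary-plane range.

```sql
-- Preview: count rows containing any code point above U+FFFF
-- (same character class as the UPDATE above).
SELECT count(*)
FROM tablename
WHERE columnname ~ E'[\\U00010000-\\U0010FFFF]';

-- Assumption: if the offending character is really U+0002, match the
-- control characters that XML 1.0 forbids (everything below U+0020
-- except tab, line feed, and carriage return).
SELECT count(*)
FROM tablename
WHERE columnname ~ E'[\\x01-\\x08\\x0B\\x0C\\x0E-\\x1F]';
```

Whichever count comes back nonzero indicates which character class belongs in the regexp_replace call.
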
--
Michael Fuhr