| From: | Bruce Momjian <bruce(at)momjian(dot)us> |
|---|---|
| To: | Michael Fuhr <mike(at)fuhr(dot)org> |
| Cc: | Michael Artz <mlartz(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org |
| Subject: | Re: encode, lower and 0x8a |
| Date: | 2007-02-20 20:50:19 |
| Message-ID: | 200702202050.l1KKoJf24989@momjian.us |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-general |
I have updated the encode() documentation to not mention "ASCII", and to
be more specific about what 'escape' does. Backpatched to 8.2.X.
---------------------------------------------------------------------------
Michael Fuhr wrote:
> On Thu, Jan 25, 2007 at 02:28:38PM -0500, Michael Artz wrote:
> > Perhaps my understanding of the 'encode' function is incorrect, but I
> > was under the impression that I could do something like:
> >
> > SELECT lower(encode(bytes, 'escape')) FROM mytable;
> >
> > as it sounded like (from the manual) that 'encode' would return valid
> > ASCII, with all the non-ascii bytes hex escaped.
>
> The documentation for encode() does give that impression: "Encode
> binary string to ASCII-only representation. Supported types are:
> base64, hex, escape." However, the source code for esc_encode()
> in src/backend/utils/adt/encode.c says and does otherwise:
>
> * Only two characters are escaped:
> * \0 (null) and \\ (backslash)
>
> > When I have the byte 0x8a, however, I get the error:
> >
> > ERROR: invalid byte sequence for encoding "UTF8": 0x8a
>
> Since encode() returns text and doesn't escape non-ASCII characters,
> all of the original binary data will be treated as though it's text
> in the database's encoding. If the data contains byte sequences
> that aren't valid in that encoding then you get the above error.
>
> > I have the sneaking suspicion that I am missing something, so please
> > correct me if I am wrong. If I am wrong, is there a better way to
> > lowercase all the ascii characters in a bytea string?
>
> What are you trying to do? What is the binary data and why are you
> treating it (or part of it) as though it's text? Do you want the
> end result to be text with escape sequences or do you want to convert
> it back to bytea?
>
> Something like this might work:
>
> SELECT lower(textin(byteaout(bytes))) FROM mytable;
>
> To turn the result back into bytea:
>
> SELECT decode(lower(textin(byteaout(bytes))), 'escape') FROM mytable;
>
> --
> Michael Fuhr
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: Have you checked our extensive FAQ?
>
> http://www.postgresql.org/docs/faq
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
| Attachment | Content-Type | Size |
|---|---|---|
| /rtmp/diff | text/x-diff | 1.2 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrej Ricnik-Bay | 2007-02-20 20:51:51 | Re: Database performance comparison paper. |
| Previous Message | Andrej Ricnik-Bay | 2007-02-20 20:36:13 | Re: Syncing postgres data with Pocket PC |