Re: COPY command character set

From: "Peter Headland" <pheadland(at)actuate(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: COPY command character set
Date: 2009-09-10 17:52:18
Message-ID: 71F491F5DA99604A80DE49424BF3D02B0CD9A21C@exchange8.actuate.com
Lists: pgsql-general

> The COPY command reference page saith
>
> Input data is interpreted according to the current client encoding,
> and output data is encoded in the current client encoding, even
> if the data does not pass through the client but is read from or
> written to a file.

Rats - I read the manual page twice and that didn't register on my
feeble consciousness. I suspect that I didn't look beyond the word
"client", since I knew I wasn't interested in client behavior and I was
speed-reading. On the assumption that I am not uniquely stupid, maybe we
could re-phrase this slightly, with a "for example", and add a heading
"Localization"?

As a general comment, I18N/L10N is a hairy enough topic that it merits
its own heading in any commands where it is an issue.

How about my suggestion to add a means (extend COPY syntax) to specify
encoding explicitly and handle UTF lead bytes - would that be of
interest?
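
For illustration only, here is a minimal sketch of the lead-byte detection idea. It assumes that "UTF lead bytes" refers to a byte order mark (BOM) at the start of the input file. The names `sniff_bom` and `decode_with_bom` are hypothetical. Nothing like them exists in COPY, and this is not a proposed implementation, just a way to show the check that would run before the data is interpreted.

```python
import codecs

# Longest marks first: the UTF-32 LE BOM (FF FE 00 00) begins with
# the UTF-16 LE BOM (FF FE), so UTF-32 has to be tested before UTF-16.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data, default="utf-8"):
    """Return (encoding, bom_length) guessed from the leading bytes.

    Falls back to (default, 0) when no BOM is present; the default
    stands in for an explicitly specified encoding.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return default, 0


def decode_with_bom(data, default="utf-8"):
    """Decode data, skipping the BOM if one was found."""
    encoding, skip = sniff_bom(data, default)
    return data[skip:].decode(encoding)
```

A file with no BOM falls through to the caller's default, which is why an explicit encoding option would still be needed alongside any automatic detection. The check is also only a heuristic: UTF-16 LE data whose first character after the BOM is U+0000 is indistinguishable from a UTF-32 LE BOM.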

--
Peter Headland
Architect
Actuate Corporation

-----Original Message-----
From: Tom Lane [mailto:tgl(at)sss(dot)pgh(dot)pa(dot)us]
Sent: Thursday, September 10, 2009 10:38
To: Peter Headland
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: [GENERAL] COPY command character set

"Peter Headland" <pheadland(at)actuate(dot)com> writes:
>> set client_encoding = 'utf8';
>> copy from stdin/to stdout;

> What if I want to do this on the server side (because it's much, much
> faster)? Does COPY use the default encoding of the database? If not,
> what?

> If this is as restrictive as it appears, and there are no outstanding
> enhancements planned in this area, I might be interested in improving
> this command to allow specifying the encoding and to have it do obvious
> stuff like recognize UTF lead bytes automatically. At the very least,
> the documentation needs some work to explain these subtleties.

The COPY command reference page saith

Input data is interpreted according to the current client encoding,
and output data is encoded in the current client encoding, even
if the data does not pass through the client but is read from or
written to a file.

Seems clear enough to me.

regards, tom lane
