Re: utf8 COPY DELIMITER?

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc:	Mark Dilger <pgsql(at)markdilger(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: utf8 COPY DELIMITER?
Date:	2007-04-17 18:28:18
Message-ID:	4129.1176834498@sss.pgh.pa.us
Lists:	pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> Mark Dilger wrote:
>> I'm working on fixing bugs relating to multibyte character encodings.
>> I wasn't sure whether this was a bug or not. I don't think we should
>> use the phrasing "COPY delimiter must be a single character" when, in
>> utf8 land, I did in fact use a single character. We might say "a
>> single byte", or we might extend the functionality to handle multibyte
>> characters.

> Doing the latter would be a feature, and so is of course right off the
> table for this release. Changing the error messages to be clearer should
> be fine.

+1 on changing the message: "character" is clearly less correct than "byte"
here.
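
To illustrate the character/byte distinction raised above, here is a minimal Python sketch. It is not PostgreSQL's code; the helper names `delimiter_lengths` and `is_single_byte` are hypothetical and exist only for this example. It shows that a delimiter which is one character can still occupy several bytes in UTF-8, which is why "single byte" describes the restriction more accurately.

```python
def delimiter_lengths(delim: str, encoding: str = "utf-8") -> tuple[int, int]:
    """Return (character count, byte count) of delim in the given encoding."""
    return len(delim), len(delim.encode(encoding))


def is_single_byte(delim: str, encoding: str = "utf-8") -> bool:
    """True if delim encodes to exactly one byte in the given encoding."""
    return len(delim.encode(encoding)) == 1


if __name__ == "__main__":
    # Comma and tab are one byte in UTF-8; U+00E9 and U+2192 are one
    # character each but encode to more than one byte.
    for d in [",", "\t", "\u00e9", "\u2192"]:
        chars, nbytes = delimiter_lengths(d)
        print(repr(d), "chars:", chars, "bytes:", nbytes,
              "single byte:", is_single_byte(d))
```

The same character can pass or fail the check depending on the encoding: U+00E9 is two bytes in UTF-8 but one byte in Latin-1.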

I doubt that supporting a single multibyte character would be an
interesting extension --- if we wanted to do anything at all there, we'd
just generalize the delimiter to be an arbitrary string. But it would
certainly slow down COPY by some amount, which is an area where you'll
get push-back for performance losses, so you'd need to make a convincing
use-case for it.
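
As a rough illustration of the generalization described above, the following Python sketch contrasts splitting on a single delimiter byte with splitting on an arbitrary delimiter string. It is only a sketch under stated assumptions: both function names are hypothetical, it ignores quoting and escape handling entirely, and it says nothing about how COPY is actually implemented or how large any slowdown would be. It only shows that the string case needs a multi-byte comparison at each position, where the byte case needs a single comparison.

```python
def split_on_byte(line: bytes, delim: int) -> list[bytes]:
    """Split line on one delimiter byte, comparing one byte per position."""
    fields, start = [], 0
    for i, b in enumerate(line):
        if b == delim:
            fields.append(line[start:i])
            start = i + 1
    fields.append(line[start:])
    return fields


def split_on_string(line: bytes, delim: bytes) -> list[bytes]:
    """Split line on an arbitrary non-empty delimiter byte string."""
    if not delim:
        raise ValueError("delimiter must not be empty")
    fields, start, i = [], 0, 0
    while i <= len(line) - len(delim):
        # Each position needs a comparison of len(delim) bytes.
        if line[i:i + len(delim)] == delim:
            fields.append(line[start:i])
            i += len(delim)
            start = i
        else:
            i += 1
    fields.append(line[start:])
    return fields


if __name__ == "__main__":
    print(split_on_byte(b"a,b,,c", ord(",")))
    arrow = "\u2192".encode("utf-8")  # one character, three bytes
    print(split_on_string("a\u2192b\u2192c".encode("utf-8"), arrow))
```

A single multibyte character is handled here as a special case of the string delimiter, which matches the point that supporting it separately would add little.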

regards, tom lane

In response to

Re: utf8 COPY DELIMITER? at 2007-04-17 17:37:58 from Andrew Dunstan

Responses

Re: utf8 COPY DELIMITER? at 2007-04-18 16:38:06 from Jim C. Nasby
