Re: New "raw" COPY format

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Jacob Champion" <jacob(dot)champion(at)enterprisedb(dot)com>
Cc: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: New "raw" COPY format
Date: 2024-10-16 17:36:24
Message-ID: 62d6d19e-4af3-45bc-a1e7-03d8adf3c37c@app.fastmail.com
Lists: pgsql-hackers

On Wed, Oct 16, 2024, at 18:04, Jacob Champion wrote:
> A hypothetical type whose text representation can contain '\r' but not
> '\n' still can't be unambiguously round-tripped under this scheme:
> COPY FROM will see the "mixed" line endings and complain, even though
> there's no ambiguity.

Yeah, that's quite an ugly limitation.

> Maybe no one will run into that problem in practice? But if they did,
> I think that'd be a pretty frustrating limitation. It'd be nice to
> override the behavior, to change it from "do what you think I mean" to
> "do what I say".

That would be nice.
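
To make the limitation concrete, here is a rough Python sketch of the two behaviors. It is not the patch's code or PostgreSQL's COPY implementation; the function names `split_auto` and `split_explicit` are hypothetical, and the detection rule (take the first line ending seen, then reject any other) is an assumption made only for illustration.

```python
def split_auto(data: bytes) -> list[bytes]:
    """'Do what you think I mean': guess the line ending, reject mixed ones."""
    first_cr = data.find(b"\r")
    first_lf = data.find(b"\n")
    if first_cr == -1 and first_lf == -1:
        # No terminator at all: a single line, or nothing.
        return [data] if data else []
    # Assumed rule: the first terminator seen decides the convention.
    if first_cr != -1 and first_lf == first_cr + 1:
        eol = b"\r\n"
    elif first_cr != -1 and (first_lf == -1 or first_cr < first_lf):
        eol = b"\r"
    else:
        eol = b"\n"
    lines = data.split(eol)
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final terminator
    for line in lines:
        # Any leftover CR or LF means the input used more than one convention.
        if b"\r" in line or b"\n" in line:
            raise ValueError("mixed line endings")
    return lines


def split_explicit(data: bytes, delimiter: bytes) -> list[bytes]:
    """'Do what I say': split only on the delimiter the caller specified."""
    lines = data.split(delimiter)
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


# A value containing '\r' but never '\n', written out with '\n' terminators.
dump = b"a\rb\nc\n"
print(split_explicit(dump, b"\n"))
try:
    split_auto(dump)
except ValueError as exc:
    print("auto-detection rejects it:", exc)
```

With an explicit delimiter the embedded '\r' is just data, so the value round-trips; with auto-detection the same bytes look like mixed line endings and are rejected.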

>> That's an interesting idea that would provide more flexibility,
>> though, at the cost of complicating things by overloading the meaning
>> of DELIMITER.
>
> I think that'd be a docs issue rather than a conceptual one, though...
> it's still a delimiter. I wouldn't really expect end-user confusion.

Yeah, I meant the docs, but that's probably fine;
we could just add a <note> to the DELIMITER documentation.

>> What I found appealing with the idea of a new COPY format,
>> was that instead of overloading the existing options
>> with more complexity, a new format wouldn't need to affect
>> the existing options, and the new format could be explained
>> separately, without making things worse for users not
>> using this format.
>
> I agree that we should not touch the existing formats. If
> RAW/SINGLE/whatever needed a multibyte line delimiter, I'm not
> proposing that the other formats should change.

Right, I didn't think you were proposing that. I meant overloading the
existing options from a docs perspective.

But I agree it's probably fine to just overload DELIMITER in the docs;
it should be possible to explain that clearly,
without causing confusion.

/Joel
