Re: New "single" COPY format

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "jian he" <jian(dot)universality(at)gmail(dot)com>
Cc: "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New "single" COPY format
Date: 2024-11-10 07:48:56
Message-ID: 1d5c40cf-eb6f-4bb1-88fe-0226257f0e32@app.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Nov 10, 2024, at 08:32, Joel Jacobson wrote:
> Option A:
> COPY TO: Empty string field gets exported as an empty line. NULL field
> is an error.
> COPY FROM: Empty line is imported as an empty string.
>
> Option B:
> COPY TO: NULL field gets exported as an empty line. Empty string field
> is an error.
> COPY FROM: Empty line is imported as a NULL value.
>
> I think Option A seems more useful, because:
>
> 1) Arbitrary text files, very often contain empty lines to separate
> sections from each other.
> 2) JSONL cannot contain empty lines, they are an error:
> https://jsonlines.org/validator/

David, I forgot about your NUL idea, so there is also a third option.

Option C:
COPY TO: NULL field gets exported as a the NUL byte. Empty string field is an empty line.
COPY FROM: Empty line is imported as an empty string. NUL byte is imported as a NULL value.

For arbitrary text files, Option C would work fine, since they usually don't contain NUL bytes, and if they do, then it seems useful to be handle to deal with such files in some way, even if it can't be known NUL always means NULL, then we could at least import such files, and then do some post-processing of the imported data, to get the desired result.

For JSONL, Option C would also work fine, since they can't contain NUL bytes.

It's a bit of a hack, but I kinda like it., since it seems like the only option without an error situation.
Maybe OK if we add one a cautionary <note> the docs?

/Joel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joel Jacobson 2024-11-10 08:00:44 Re: New "single" COPY format
Previous Message Joel Jacobson 2024-11-10 07:32:30 Re: New "single" COPY format