Re: New "single" COPY format

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "jian he" <jian(dot)universality(at)gmail(dot)com>
Cc: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, "PostgreSQL Hackers" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: New "single" COPY format
Date: 2024-11-10 07:05:54
Message-ID: 87b6f33d-495c-44dd-b51d-543b7a99d2a9@app.fastmail.com
Lists: pgsql-hackers

On Sun, Nov 10, 2024, at 05:26, jian he wrote:
> On Sun, Nov 10, 2024 at 3:29 AM Joel Jacobson <joel(at)compiler(dot)org> wrote:
>>
>> Cool. I've drafted a new patch on this approach.
>> The list of newline-free built-in types is not exhaustive, yet.
>
>
> do we care that COPY back and forth always work?

Yes, I think that's an important design goal.

> doc not mentioned, but seems it's an implicit idea.

True, the docs should be clear on this. I'll update them
once we've decided what to do; see below.

> copy the_table to '/tmp/3.txt' with (format whatever_format);
> truncate the_table;
> copy the_table from '/tmp/3.txt' with (format whatever_format);
>
> but v20 will not work for a non-text column with SQL NULL data in it.
>
> example:
> drop table if exists x1;
> create table x1(a int);
> insert into x1 select null;
> copy x1 to '/tmp/3.txt' with (format list);
> copy x1 from '/tmp/3.txt' with (format list);
> ERROR: invalid input syntax for type integer: ""
> CONTEXT: COPY x1, line 1, column a: ""
>
> <para>
> The <literal>list</literal> format does not distinguish a
> <literal>NULL</literal>
> value from an empty string. Empty lines are imported as empty strings, not
> as <literal>NULL</literal> values.
> </para>
> we only mentioned import, not export (COPY TO) dealing with
> NULL value.

Nice catch.
I'll respond to this in my reply to David's later message in this thread.
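
For illustration only, here is a minimal standalone C sketch of why the round trip above fails. It is not code from the patch: `NullableInt`, `export_line` and `import_line` are hypothetical names, and `strtol` merely stands in for the integer input function. The assumption is that a NULL is exported as an empty line and that every imported line is handed to the column type's parser.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical model of a nullable int column value. */
typedef struct
{
	bool		is_null;
	int			value;
} NullableInt;

/*
 * Export: the format has no NULL marker, so a NULL becomes an empty line
 * (assumption made for this sketch). buflen must be at least 1.
 */
static void
export_line(NullableInt v, char *buf, size_t buflen)
{
	if (v.is_null)
		buf[0] = '\0';
	else
		snprintf(buf, buflen, "%d", v.value);
}

/*
 * Import: every line goes to the type's parser. An empty line is not a
 * valid integer, so this returns false, which corresponds to the
 * "invalid input syntax" error in the example above.
 */
static bool
import_line(const char *line, NullableInt *out)
{
	char	   *end;
	long		parsed;

	errno = 0;
	parsed = strtol(line, &end, 10);
	if (end == line || *end != '\0' || errno == ERANGE ||
		parsed < INT_MIN || parsed > INT_MAX)
		return false;

	out->is_null = false;
	out->value = (int) parsed;
	return true;
}
```

With this model, exporting a NULL and importing the resulting empty line fails for an int column, while a text column would silently get an empty string instead of NULL.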

> + if (c == '\n' || c == '\r')
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
> + errmsg("list format doesn't support newlines in field values"),
> + errhint("Consider using csv or text format for data containing newlines.")));
>
> "list format doesn't support newlines in field values"
> word list need single or double quote?

Fixed, to match other code.

I also changed to \"list\" instead of 'list' everywhere in error messages,
since that seems much more common in existing PostgreSQL code.

Also changed wording to match the other error messages better:
- errmsg("list format doesn't support newlines in field values"),
+ errmsg("COPY with format \"list\" doesn't support newlines in field values")));

Also removed this errhint since it seemed unnecessary.
- errhint("Consider using csv or text format for data containing newlines.")));
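
As a rough standalone illustration of the check behind this error (not the patch code itself): the function name `field_has_newline` is hypothetical, and the sketch returns a flag where the real code calls ereport().

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Return true if the field value contains a line feed or a carriage
 * return, i.e. a character the "list" format cannot represent because
 * each line holds exactly one value.
 */
static bool
field_has_newline(const char *value)
{
	for (const char *p = value; *p != '\0'; p++)
	{
		char		c = *p;

		if (c == '\n' || c == '\r')
			return true;
	}
	return false;
}
```

A caller would run this on each field value before writing it and raise the error if it returns true.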

> ereport(ERROR,
> (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
> errmsg("Unsupported COPY format")));
> should be "unsupported" per
> https://www.postgresql.org/docs/current/error-style-guide.html#ERROR-STYLE-GUIDE-CASE

Fixed.

/Joel

Attachments:
  v21-0001-Introduce-CopyFormat-and-replace-csv_mode-and-binary.patch (application/octet-stream, 18.8 KB)
  v21-0002-Add-COPY-format-list.patch (application/octet-stream, 34.9 KB)
  v21-0003-Reorganize-option-validations.patch (application/octet-stream, 19.7 KB)
