Re: New "raw" COPY format

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Tatsuo Ishii" <ishii(at)postgresql(dot)org>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: New "raw" COPY format
Date: 2024-10-15 07:54:42
Message-ID: 4bcd0fbd-5dfa-45f6-bddc-53eaf060296f@app.fastmail.com
Lists: pgsql-hackers

On Tue, Oct 15, 2024, at 03:35, Joel Jacobson wrote:
> On Mon, Oct 14, 2024, at 21:59, Joel Jacobson wrote:
>> On Mon, Oct 14, 2024, at 10:51, Joel Jacobson wrote:
>>> On Mon, Oct 14, 2024, at 10:07, Joel Jacobson wrote:
>>>> Attached is a first draft implementation of the new proposed COPY "raw" format.
>>>>
>>>> The first two patches are just the bug fix in HEAD, reported separately:
>>>> https://commitfest.postgresql.org/50/5297/
...
> Sorry about the noise. I'm not running the full test suite,
> with tap and `meson test --num-processes 32`,
> so hopefully I won't cause cfbot failures as often any longer.

Oops, that should have said:
"Sorry about the noise. I'm *now* running the full test suite"

However, I see Windows still failed on copy2.sql,
and I think the reason could be the use of \qecho -n
to create files with inconsistent newline style, e.g.:

\o :filename
\qecho -n line1
\qecho -n '\n'
\qecho -n line2
\qecho -n '\r\n'
\o
COPY copy_raw_test_errors (col1) FROM :'filename' (FORMAT raw);

Maybe Windows automatically translates \n into \r\n, and vice versa?
If so, this would explain why this test failed on Windows.
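
To illustrate the suspicion, here is a minimal Python sketch (not part of the patch). It assumes the output file is opened in text mode, which is a guess on my part, and emulates a Windows text-mode stream with newline="\r\n". The helper name write_text_mode is made up for this example.

```python
import io


def write_text_mode(chunks, newline):
    """Write chunks through a text-mode layer and return the resulting bytes.

    newline="\r\n" emulates a Windows text-mode stream (assumption);
    newline="" disables newline translation entirely.
    """
    raw = io.BytesIO()
    text = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    for chunk in chunks:
        text.write(chunk)
    text.flush()
    data = raw.getvalue()
    text.detach()  # keep the wrapper from closing the buffer behind our back
    return data


# Same sequence as the \qecho -n calls in the failing test
chunks = ["line1", "\n", "line2", "\r\n"]

print(write_text_mode(chunks, newline=""))      # bytes written unchanged
print(write_text_mode(chunks, newline="\r\n"))  # every "\n" becomes "\r\n"
```

With translation enabled, the lone "\n" turns into "\r\n" and the "\r\n" turns into "\r\r\n", so the file no longer has the newline mix the test expects.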

Btw, does anyone know if it's possible to download the "regression.diffs" file from the CI task?

I've downloaded all the crashlog, meson_log, and testrun zip files from

https://cirrus-ci.com/task/4564405273231360

but none of these contained the "regression.diffs" mentioned here:

[02:09:42.431] # The differences that caused some tests to fail can be viewed in the file "C:/cirrus/build/testrun/regress/regress/regression.diffs".

Anyhow, I think I've fixed the problem now, in a cross-platform safe way,
by shipping src/test/regress/data/newline*.data files:

newlines_cr.data
newlines_cr_lr.data
newlines_cr_lr_nolast.data
newlines_cr_nolast.data
newlines_lr.data
newlines_lr_nolast.data
newlines_mixed_1.data
newlines_mixed_2.data
newlines_mixed_3.data
newlines_mixed_4.data
newlines_mixed_5.data
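
For anyone who wants to produce or inspect similar files locally, here is a rough Python sketch. It is not how the files in the patch were created, and the file name mixed_example.data and its payload are hypothetical. The point is only that binary mode keeps the bytes identical on every platform.

```python
import os
import tempfile


def write_exact(path, payload):
    """Write payload bytes as-is; "wb" bypasses newline translation."""
    with open(path, "wb") as f:
        f.write(payload)


def count_terminators(path):
    """Count CRLF pairs, lone CRs, and lone LFs in a file, read in binary mode."""
    with open(path, "rb") as f:
        data = f.read()
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "cr": data.count(b"\r") - crlf,  # CRs not followed by LF
        "lf": data.count(b"\n") - crlf,  # LFs not preceded by CR
    }


with tempfile.TemporaryDirectory() as tmpdir:
    # Hypothetical file name and content, for illustration only
    path = os.path.join(tmpdir, "mixed_example.data")
    write_exact(path, b"line1\nline2\r\n")
    print(count_terminators(path))
```

A check like count_terminators can also confirm that a checked-out data file still has the intended newline style.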

These are then used in copy.sql and copy2.sql, e.g.:

copy.sql:

\set filename :abs_srcdir '/data/newlines_lr.data'
TRUNCATE copy_raw_test;
COPY copy_raw_test (col) FROM :'filename' (FORMAT raw);
SELECT col, col IS NULL FROM copy_raw_test ORDER BY id;

copy2.sql:

-- Test inconsistent newline style
\set filename :abs_srcdir '/data/newlines_mixed_1.data'
COPY copy_raw_test_errors (col1) FROM :'filename' (FORMAT raw);

Attaching new version. It's only patch 0016 that has been updated.

/Joel

Attachment Content-Type Size
v8-0001-Fix-thinko-in-tests-for-COPY-options-force_not_null-.patch application/octet-stream 4.6 KB
v8-0002-Fix-validation-of-FORCE_NOT_NULL-FORCE_NULL-for-all-.patch application/octet-stream 5.0 KB
v8-0003-Replace-binary-flags-binary-and-csv_mode-with-format.patch application/octet-stream 18.6 KB
v8-0004-Set-default-format-if-not-specified.patch application/octet-stream 929 bytes
v8-0005-Separate-DELIMITER-and-NULL-option-validation-into-t.patch application/octet-stream 9.2 KB
v8-0006-Separate-QUOTE-option-validation-into-its-own-sectio.patch application/octet-stream 2.4 KB
v8-0007-Separate-ESCAPE-option-validation-into-its-own-secti.patch application/octet-stream 2.5 KB
v8-0008-Separate-DEFAULT-option-validation-into-its-own-sect.patch application/octet-stream 5.0 KB
v8-0009-Separate-HEADER-option-validation-into-its-own-secti.patch application/octet-stream 1.5 KB
v8-0010-Separate-FORCE_QUOTE-option-validation-into-its-own-.patch application/octet-stream 2.2 KB
v8-0011-Separate-FORCE_NOT_NULL-option-validation-into-its-o.patch application/octet-stream 2.2 KB
v8-0012-Separate-FORCE_NULL-option-validation-into-its-own-s.patch application/octet-stream 2.2 KB
v8-0013-Separate-FREEZE-option-validation-into-its-own-secti.patch application/octet-stream 1.6 KB
v8-0014-Separate-ON_ERROR-option-validation-into-its-own-sec.patch application/octet-stream 1.4 KB
v8-0015-Separate-REJECT_LIMIT-option-validation-into-its-own.patch application/octet-stream 2.0 KB
v8-0016-Add-raw-COPY-format-support-for-unstructured-text-da.patch application/octet-stream 43.8 KB
