From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Joel Jacobson <joel(at)compiler(dot)org> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: New "single" COPY format |
Date: | 2024-11-07 23:13:32 |
Message-ID: | CAD21AoA_cSj3Fen+fiWvW1YWHpCqroMLCRszBcuXkTr44pY+-g@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
On Thu, Nov 7, 2024 at 8:16 AM Joel Jacobson <joel(at)compiler(dot)org> wrote:
>
> Hi hackers,
>
> Thread [1] renamed, since the format name has now been changed from 'raw' to
> 'single', as suggested by Andrew Dunstan and Jacob Champion.
>
> [1] https://postgr.es/m/c12516b1-77dc-4ad3-94a7-88527360aee0@app.fastmail.com
>
> Recap: This is about adding support to import/export text-based formats such as
> JSONL, or any unstructured text file, where wanting to import each line "as is"
> into a single column, or wanting to export a single column to a text file.
>
> Example importing the meson-logs/testlog.json file Meson generates
> when building PostgreSQL, which is in JSONL format:
>
> # create table meson_log (log_line jsonb);
> # \copy meson_log from meson-logs/testlog.json (format single);
> COPY 306
> # select log_line->'name' name, log_line->'result' result from meson_log limit 3;
> name | result
> -----------------------------------------+--------
> "postgresql:setup / tmp_install" | "OK"
> "postgresql:setup / install_test_files" | "OK"
> "postgresql:setup / initdb_cache" | "OK"
> (3 rows)
>
> Changes since v16:
>
> * EOL handling now works the same as for 'text' and 'csv'.
> In v16, we supported multi-byte delimiters to allow specifying
> e.g. Windows EOL (\r\n), but this seemed unnecessary, if we just do what we do
> for text/csv, that is, to auto-detect the EOL for COPY FROM, and use
> the OS default EOL for COPY TO.
> The DELIMITER option is therefore invalid for the 'single' format.
> This is the biggest change in the code, between v16 and v18.
> CopyReadLineRawText() has been renamed to CopyReadLineSingleText(),
> and changed accordingly.
In earlier versions, the patch supported loading a whole file into a
single tuple. Is there any reason v18 doesn't support that? I think
that, if it's useful, we can address it in a separate patch.
>
> * A final EOL is now emitted to the last record in COPY TO.
> So now it works just like 'text' and 'csv'.
>
+1
> * HEADER [ boolean | MATCH ] now supported
> This is now again supported, as previously suggested by Daniel Verite,
> possible thanks to the EOL handling.
It makes sense to support it.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com