From: | Joe Conway <mail(at)joeconway(dot)com> |
---|---|
To: | Davin Shearer <davin(at)apache(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Cc: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Subject: | Re: Emitting JSON to file using COPY TO |
Date: | 2023-12-02 03:10:54 |
Message-ID: | a362ce61-8646-489e-a374-e8301a385b02@joeconway.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general pgsql-hackers |
On 12/1/23 22:00, Davin Shearer wrote:
> I'm really glad to see this taken up as a possible new feature and will
> definitely use it if it gets released. I'm impressed with how clean,
> understandable, and approachable the postgres codebase is in general and
> how easy it is to read and understand this patch.
>
> I reviewed the patch (though I didn't build and test the code) and have
> a concern with adding the '[' at the beginning and ']' at the end of the
> json output. Those are already added by `json_agg`
> (https://www.postgresql.org/docs/current/functions-aggregate.html
> <https://www.postgresql.org/docs/current/functions-aggregate.html>) as
> you can see in my initial email. Adding them in the COPY TO may be
> redundant (e.g., [[{"key":"value"...}....]]).
With this patch in place you don't use json_agg() at all. See the
example output (this is real output with the patch applied):
(oops -- I meant to send this with the same email as the patch)
8<-------------------------------------------------
create table foo(id int8, f1 text, f2 timestamptz);
insert into foo
select g.i,
'line: ' || g.i::text,
clock_timestamp()
from generate_series(1,4) as g(i);
copy foo to stdout (format 'json');
[
{"id":1,"f1":"line: 1","f2":"2023-12-01T12:58:16.776863-05:00"}
,{"id":2,"f1":"line: 2","f2":"2023-12-01T12:58:16.777084-05:00"}
,{"id":3,"f1":"line: 3","f2":"2023-12-01T12:58:16.777096-05:00"}
,{"id":4,"f1":"line: 4","f2":"2023-12-01T12:58:16.777103-05:00"}
]
8<-------------------------------------------------
> I think COPY TO makes good sense to support, though COPY FROM maybe not
> so much as JSON isn't necessarily flat and rectangular like CSV.
Yeah -- definitely not as straight forward but possibly we just support
the array-of-jsonobj-rows as input as well?
> For my use-case, I'm emitting JSON files to Apache NiFi for processing,
> and NiFi has superior handling of JSON (via JOLT parsers) versus CSV
> where parsing is generally done with regex. I want to be able to emit
> JSON using a postgres function and thus COPY TO.
>
> Definitely +1 for COPY TO.
>
> I don't think COPY FROM will work out well unless the JSON is required
> to be flat and rectangular. I would vote -1 to leave it out due to the
> necessary restrictions making it not generally useful.
--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Joe Conway | 2023-12-02 14:31:46 | Re: Emitting JSON to file using COPY TO |
Previous Message | Davin Shearer | 2023-12-02 03:00:29 | Re: Emitting JSON to file using COPY TO |
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2023-12-02 04:11:08 | Re: Proposal: Filter irrelevant change before reassemble transactions during logical decoding |
Previous Message | Lev Kokotov | 2023-12-02 03:06:40 | Re: Bug in pgbench prepared statements |