Re: Emitting JSON to file using COPY TO

From: Joe Conway <mail(at)joeconway(dot)com>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>, Davin Shearer <davin(at)apache(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>
Subject: Re: Emitting JSON to file using COPY TO
Date: 2023-12-03 16:03:14
Message-ID: 8c88da85-c197-4765-96f8-a9a1c78305a6@joeconway.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

On 12/3/23 10:10, Andrew Dunstan wrote:
>
> On 2023-12-01 Fr 14:28, Joe Conway wrote:
>> On 11/29/23 10:32, Davin Shearer wrote:
>>> Thanks for the responses everyone.
>>>
>>> I worked around the issue using the `psql -tc` method as Filip
>>> described.
>>>
>>> I think it would be great to support writing JSON using COPY TO at
>>> some point so I can emit JSON to files using a PostgreSQL function
>>> directly.
>>>
>>> -Davin
>>>
>>> On Tue, Nov 28, 2023 at 2:36 AM Filip Sedlák <filip(at)sedlakovi(dot)org
>>> <mailto:filip(at)sedlakovi(dot)org>> wrote:
>>>
>>>     This would be a very special case for COPY. It applies only to a
>>> single
>>>     column of JSON values. The original problem can be solved with psql
>>>     --tuples-only as David wrote earlier.
>>>
>>>
>>>     $ psql -tc 'select json_agg(row_to_json(t))
>>>                    from (select * from public.tbl_json_test) t;'
>>>
>>>        [{"id":1,"t_test":"here's a \"string\""}]
>>>
>>>
>>>     Special-casing any encoding/escaping scheme leads to bugs and harder
>>>     parsing.
>>
>> (moved to hackers)
>>
>> I did a quick PoC patch (attached) -- if there interest and no hard
>> objections I would like to get it up to speed for the January commitfest.
>>
>> Currently the patch lacks documentation and regression test support.
>>
>> Questions:
>> ----------
>> 1. Is supporting JSON array format sufficient, or does it need to
>> support some other options? How flexible does the support scheme need
>> to be?
>>
>> 2. This only supports COPY TO and we would undoubtedly want to support
>> COPY FROM for JSON as well, but is that required from the start?
>>
>> Thanks for any feedback.
>
> I  realize this is just a POC, but I'd prefer to see composite_to_json()
> not exposed. You could use the already public datum_to_json() instead,
> passing JSONTYPE_COMPOSITE and F_RECORD_OUT as the second and third
> arguments.

Ok, thanks, will do

> I think JSON array format is sufficient.

The other formats make sense from a completeness standpoint (versus
other databases) and the latest patch already includes them, so I still
lean toward supporting all three formats.

> I can see both sides of the COPY FROM argument, but I think insisting on
> that makes this less doable for release 17. On balance I would stick to
> COPY TO for now.

WFM.

From your earlier post, regarding constructing the aggregate -- not
extensive testing but one data point:
8<--------------------------
test=# copy foo to '/tmp/buf' (format json, force_array);
COPY 10000000
Time: 36353.153 ms (00:36.353)
test=# copy (select json_agg(foo) from foo) to '/tmp/buf';
COPY 1
Time: 46835.238 ms (00:46.835)
8<--------------------------

--
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Joe Conway 2023-12-03 17:11:34 Re: Emitting JSON to file using COPY TO
Previous Message Joe Conway 2023-12-03 15:51:12 Re: Emitting JSON to file using COPY TO

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2023-12-03 16:14:15 Re: More new SQL/JSON item methods
Previous Message Joe Conway 2023-12-03 15:51:12 Re: Emitting JSON to file using COPY TO