| From: | jian he <jian(dot)universality(at)gmail(dot)com> | 
|---|---|
| To: | Joe Conway <mail(at)joeconway(dot)com> | 
| Cc: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Daniel Verite <daniel(at)manitou-mail(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Davin Shearer <davin(at)apache(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: Emitting JSON to file using COPY TO | 
| Date: | 2024-09-13 14:42:00 | 
| Message-ID: | CACJufxH=3PreKDHuoskn0GQmxUcw-n0k3ogN1CD4CvFbLBy-VA@mail.gmail.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-general pgsql-hackers | 
Hi.
in ExecutePlan
we have:
    for (;;)
    {
        ResetPerTupleExprContext(estate);
        slot = ExecProcNode(planstate);
        if (!TupIsNull(slot))
        {
            if((slot != NULL) && (slot->tts_tupleDescriptor != NULL)
            && (slot->tts_tupleDescriptor->natts > 0)
            && (slot->tts_tupleDescriptor->attrs->attname.data[0] == '\0'))
            elog(INFO, "%s:%d %s this slot first attribute attname is
null", __FILE_NAME__, __LINE__, __func__);
        }
        if (TupIsNull(slot))
            break;
        if (sendTuples)
        {
            if (!dest->receiveSlot(slot, dest))
                break;
        }
dest->receiveSlot(slot, dest) is responsible for sending values to destination,
for COPY TO it will call copy_dest_receive, CopyOneRowTo.
For the copy to format json, we need to make sure
in "dest->receiveSlot(slot, dest))", the slot->tts_tupleDescriptor has
proper information.
because we *use* slot->tts_tupleDescriptor->attrs->attname as the json key.
For example, if (slot->tts_tupleDescriptor->attrs->attname.data[0] == '\0')
then output json may look like:  {"":12}
which is not what we want.
in ExecutePlan i use
elog(INFO, "%s:%d %s this slot first attribute attname is null",
__FILE_NAME__, __LINE__, __func__);
to find sql queries that attribute name is not good.
based on that, i found out many COPY TO (FORMAT JSON) queries will either
error out or the output json key be empty string
if in CopyOneRowTo we didn't copy the cstate->queryDesc->tupDesc
to the slot->tts_tupleDescriptor
You can test it yourself.
first `git am v12-0001-introduce-json-format-for-COPY-TO.patch`
after that, comment out the memcpy call in CopyOneRowTo, just like the
following:
        if(!cstate->rel)
        {
            // memcpy(TupleDescAttr(slot->tts_tupleDescriptor, 0),
            //     TupleDescAttr(cstate->queryDesc->tupDesc, 0),
            //     cstate->queryDesc->tupDesc->natts *
sizeof(FormData_pg_attribute));
build and test with the attached script.
you will see COPY TO FORMAT JSON, lots of cases where the json key
becomes an empty string.
I think this thread related issues has been resolved.
| Attachment | Content-Type | Size | 
|---|---|---|
| scratch31.sql | application/sql | 1.5 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Adrian Klaver | 2024-09-13 14:50:56 | Re: Manual query vs trigger during data load | 
| Previous Message | François SIMON | 2024-09-13 14:15:59 | Re: post-bootstrap init : permission denied pg_description | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tom Lane | 2024-09-13 14:56:19 | Re: Add system column support to the USING clause | 
| Previous Message | Peter Eisentraut | 2024-09-13 14:39:56 | Re: tiny step toward threading: reduce dependence on setlocale() |