Re: Split copy.c

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: Split copy.c
Date: 2020-11-16 10:25:45
Message-ID: bd19fa6e-d0f4-9009-44cd-6e885a4c7236@iki.fi
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 16/11/2020 04:28, Justin Pryzby wrote:
> On Tue, Nov 03, 2020 at 03:15:27PM +1300, David Rowley wrote:
>> On Tue, 3 Nov 2020 at 07:35, Andres Freund <andres(at)anarazel(dot)de> wrote:
>>>
>>> On 2020-11-02 19:43:38 +0200, Heikki Linnakangas wrote:
>>>> On 02/11/2020 19:23, Andres Freund wrote:
>>>>> On 2020-11-02 11:03:29 +0200, Heikki Linnakangas wrote:
>>>>>> There isn't much common code between COPY FROM and COPY TO, so I propose
>>>>>> that we split copy.c into two: copyfrom.c and copyto.c. See attached. I thin
>>>>>> that's much nicer.
>>>>>
>>>>> Not quite convinced that's the right split - or perhaps there's just
>>>>> more potential. My feeling is that splitting out all the DML related
>>>>> code would make the code considerably easier to read.
>>>>
>>>> What do you mean by DML related code?
>>>
>>> Basically all the insertion related code (e.g CopyMultiInsert*, lots of
>>> code in CopyFrom()) and perhaps also the type input invocations.
>>
>> I quite like the fact that those are static and inline-able. I very
>> much imagine there'd be a performance hit if we moved them out to
>> another .c file and made them extern. Some of those functions can be
>> quite hot when copying into a partitioned table.
>
> For another patch [0], I moved into copy.h:
> +typedef struct CopyMultiInsertBuffer
> +typedef struct CopyMultiInsertInfo
> +CopyMultiInsertBufferInit(ResultRelInfo *rri)
> +CopyMultiInsertInfoSetupBuffer(CopyMultiInsertInfo *miinfo,
> +CopyMultiInsertInfoIsFull(CopyMultiInsertInfo *miinfo)
> +CopyMultiInsertBufferCleanup(CopyMultiInsertInfo *miinfo,
> +CopyMultiInsertInfoNextFreeSlot(CopyMultiInsertInfo *miinfo,
> +CopyMultiInsertInfoStore(CopyMultiInsertInfo *miinfo, ResultRelInfo *rri,
>
> That's an experimental part 0002 of my patch in response to Simon's suggestion.
> Maybe your response will be that variants of those interfaces should be added
> to nodeModifyTable.[ch] instead of moving them.

Nice. I don't think that affects this patch too much.

I would suggest renaming renaming the functions and structs to remove
the "Copy"-prefix. COPY uses them, but so does INSERT with the patch.

> Currently I'm passing (void*)mtstate as cstate - if there were a
> generic interface, that would be a void *state or so.
The functions only need cstate/mtstate to set the line number, for the
error callback, and to access the transition_capture field. You could
add a field for transition_capture in CopyMultiInsertInfo. For the line
number, you could add a line number field in CopyMultiInsertInfo, set
that in CopyMultiInsertBufferFlush() instead of cstate->cur_lineno, and
teach CopyFromErrorCallback() to get the line number from there.

- Heikki

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Magnus Hagander 2020-11-16 10:41:51 Re: Online verification of checksums
Previous Message a.pervushina 2020-11-16 10:09:30 Re: [HACKERS] make async slave to wait for lsn to be replayed