Re: Make COPY format extendable: Extract COPY TO format implementations

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Sutou Kouhei <kou(at)clear-code(dot)com>, zhjwpku(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2025-02-05 06:20:51
Message-ID: CAD21AoBkDE4JwjPgcLxSEwqu3nN4VXjkYS9vpRQDwA2GwNQCsg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Feb 4, 2025 at 9:10 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Sat, Feb 01, 2025 at 07:12:01PM +0900, Sutou Kouhei wrote:
> > For the propose, copyapi.h should not include
> > copy{to,from}_internal.h. If we do it, copyto.c includes
> > CopyFromState and copyfrom*.c include CopyToState.
> >
> > What do you think about the following change? Note that
> > extensions must include copy{to,from}_internal.h explicitly
> > in addition to copyapi.h.
>
> I was just looking at bit at this series of patch labelled with v31,
> to see what is happening here.
>
> In 0001, we have that:
>
> + /* format-specific routines */
> + const CopyToRoutine *routine;
> [...]
> - CopySendEndOfRow(cstate);
> + cstate->routine->CopyToOneRow(cstate, slot);
>
> Having a callback where the copy state is processed once per row is
> neat in terms of design for the callbacks and what extensions can do,
> and this is much better than what 2889fd23be5 has attempted (later
> reverted in 1aa8324b81fa) because we don't do indirect function calls
> for each attribute. Still, I have a question here: what happens for a
> COPY TO that involves one attribute, a short field size like an int2
> and many rows (the more rows the more pronounced the effect, of
> course)? Could this level of indirection still be the cause of some
> regressions in a case like that? This is the worst case I can think
> about, on top of my mind, and I am not seeing tests with few
> attributes like this one, where we would try to make this callback as
> hot as possible. This is a performance-sensitive area.

FYI when Sutou-san last measured the performance[1], it showed a
slight speed up even with fewer columns (5 columns) in both COPY TO
and COPY FROM cases. The callback design has not changed since then.
But it would be a good idea to run the benchmark with a table having a
single small size column.

Regards,

[1] https://www.postgresql.org/message-id/20241114.161948.1677325020727842666.kou%40clear-code.com

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2025-02-05 06:28:04 Re: Test to dump and restore objects left behind by regression
Previous Message Michael Paquier 2025-02-05 06:12:53 Re: Add isolation test template in injection_points for wait/wakeup/detach