Re: Make COPY format extendable: Extract COPY TO format implementations

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Sutou Kouhei <kou(at)clear-code(dot)com>
Cc: michael(at)paquier(dot)xyz, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2024-11-19 01:02:41
Message-ID: CAD21AoC=DX5QQVb27C6UdpPfY-F=-PGnQ1u6rWo69DV=4EtDdw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 14, 2024 at 4:04 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Wed, Nov 13, 2024 at 11:19 PM Sutou Kouhei <kou(at)clear-code(dot)com> wrote:
> >
> > Hi,
> >
> > In <20241105(dot)174328(dot)1705956947135248653(dot)kou(at)clear-code(dot)com>
> > "Re: Make COPY format extendable: Extract COPY TO format implementations" on Tue, 05 Nov 2024 17:43:28 +0900 (JST),
> > Sutou Kouhei <kou(at)clear-code(dot)com> wrote:
> >
> > >> I've further investigated the performance regression, and found out it
> > >> might be relevant that the compiler doesn't inline the
> > >> CopyFromTextLikeOneRow() function. It might be worth testing with
> > >> pg_attribute_always_inline instead of 'inline' as below:
> > >
> > > Wow! Good catch!
> > >
> > > I've rebased on the current master and updated the v20 and
> > > v21 patch sets with "pg_attribute_always_inline" not
> > > "inline".
> > >
> > > The v22 patch set is for the v20 patch set.
> > > (TO/FROM changes are in one commit.)
> > >
> > > The v23 patch set is for the v21 patch set.
> > > (TO/FROM changes are separated for easy to merge only FROM
> > > or TO part.)
> >
> > I've run benchmark on my machine that has "Intel(R) Core(TM)
> > i7-3770 CPU @ 3.40GHz".
> >
> > Summary:
> > * "pg_attribute_always_inline" is effective for the "COPY
> > FROM" part
> > * "pg_attribute_always_inline" may not be needed for the
> > "COPY TO" part
> >
> >
> > v20-result.pdf: This is the same result PDF attached in
> > https://www.postgresql.org/message-id/20241008.173918.995935870630354246.kou%40clear-code.com
> > . This is the base line for "pg_attribute_always_inline"
> > change.
> >
> > v22-result.pdf: This is a new result PDF for the v22 patch
> > set.
> >
> > COPY FROM:
> >
> > 0001: The v22 patch set is slower than HEAD. This just
> > introduces "routine" abstraction. It increases overhead. So
> > this is expected.
> >
> > 0002-0005: The v22 patch set is faster than HEAD for all
> > cases. The v20 patch set is slower than HEAD for smaller
> > data. This shows that "pg_attribute_always_inline" for the
> > "COPY FROM" part is effective on my machine too.
> >
> >
> > COPY TO:
> >
> > 0001: The v22 patch set is slower than HEAD. This is
> > as expected for the same reason as COPY FROM.
> >
> > 0002-0004: The v22 patch set is slower than HEAD. (The v20
> > patch set is faster than HEAD.) This is not expected.
> >
> > 0005: The v22 patch set is faster than HEAD. This is
> > expected. But 0005 just exports some functions. It doesn't
> > change existing logic. So it's strange...
> >
> > This shows "pg_attribute_always_inline" is needless for the
> > "COPY TO" part.
> >
> >
> > I also tried the v24 patch set:
> > * The "COPY FROM" part is same as the v22 patch set
> > ("pg_attribute_always_inline" is used.)
> > * The "COPY TO" part is same as the v20 patch set
> > ("pg_attribute_always_inline" is NOT used.)
> >
> >
> > (I think that the v24 patch set isn't useful for others. So
> > I don't share it here. If you're interested in it, I'll
> > share it here.)
> >
> > v24-result.pdf:
> >
> > COPY FROM: The same trend as the v22 patch set result. It's
> > expected because the "COPY FROM" part is the same as the v22
> > patch set.
> >
> > COPY TO: The v24 patch set is faster than the v22 patch set
> > but the v24 patch set isn't same trend as the v20 patch
> > set. This is not expected because the "COPY TO" part is the
> > same as the v20 patch set.
> >
> >
> > Anyway, the 0005 "COPY TO" parts are always faster than
> > HEAD. So we can use either "pg_attribute_always_inline" or
> > "inline".
> >
> >
> > Summary:
> > * "pg_attribute_always_inline" is effective for the "COPY
> > FROM" part
> > * "pg_attribute_always_inline" may not be needed for the
> > "COPY TO" part
> >
> >
> > Can we proceed this proposal with these results? Or should
> > we wait for more benchmark results?
>
> Thank you for sharing the benchmark test results! I think these
> results are good for us to proceed. I'll closely look at COPY TO
> results and review v22 patch sets.

I have a question about v22. We use pg_attribute_always_inline for
some functions to avoid function call overheads. Applying it to
CopyToTextLikeOneRow() and CopyFromTextLikeOneRow() are legitimate as
we've discussed. But there are more function where the patch applied
it to:

-bool
-NextCopyFromRawFields(CopyFromState cstate, char ***fields, int *nfields)
+static pg_attribute_always_inline bool
+NextCopyFromRawFields(CopyFromState cstate, char ***fields, int
*nfields, bool is_csv)

-static bool
-CopyReadLineText(CopyFromState cstate)
+static pg_attribute_always_inline bool
+CopyReadLineText(CopyFromState cstate, bool is_csv)

+static pg_attribute_always_inline void
+CopyToTextLikeSendEndOfRow(CopyToState cstate)

I think it's out of scope of this patch even if these changes are
legitimate. Is there any reason for these changes?

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andy Fan 2024-11-19 01:03:57 Document for wal_log_hints
Previous Message Michael Paquier 2024-11-19 00:57:57 Re: psql: Add leakproof field to \dAo+ meta-command results