Re: Make COPY format extendable: Extract COPY TO format implementations

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Sutou Kouhei <kou(at)clear-code(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Make COPY format extendable: Extract COPY TO format implementations
Date: 2024-10-10 22:55:34
Message-ID: CAD21AoBfLWbe3GtD3E8zLJtzvk49=ho21j8drfp6GwdbhLD=LQ@mail.gmail.com
Lists: pgsql-hackers

On Tue, Oct 8, 2024 at 8:34 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Oct 07, 2024 at 03:23:08PM -0700, Masahiko Sawada wrote:
> > In the benchmark, I've applied the v20 patch set, and 'master' in the
> > results refers to a19f83f87966. I disabled CPU turbo boost where
> > possible. Overall, the v20 patch set showed similar or better
> > performance in both COPY FROM and COPY TO compared to master, except
> > on macOS. I'm not sure whether changes made to master since the last
> > benchmark run by Tomas and Sutou-san might have contributed to these
> > results.
>
> Don't think so. FWIW, I have been looking at the set of tests I ran
> with previous patch versions around v7 and v10, and did notice a
> similar pattern where COPY FROM was getting slightly better for text
> and binary. It did not look like only noise was involved, and it was
> kind of reproducible. As long as we avoid the function pointer
> redirection for the per-row processing when dealing with in-core
> formats, we should be fine as far as I understand. That's what the
> latest patch set is doing, based on a read of v21.

Yeah, what the v21 patch is doing makes sense to me.
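
To illustrate the idea of avoiding a per-row function pointer call for
in-core formats, here is a minimal standalone sketch. It is not taken
from the patch set: Row, row_out_fn, text_row_out, and the copy_rows*
functions are all hypothetical names, and the only assumption is that
the format is chosen once per COPY so that built-in formats can use a
direct call inside the row loop.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical row type, a stand-in for a tuple to be written out. */
typedef struct Row
{
    int value;
} Row;

/* Hypothetical per-row callback that a custom format would supply. */
typedef size_t (*row_out_fn) (const Row *row, char *buf, size_t buflen);

/* Built-in "text" output: formats one row, returns bytes produced. */
static inline size_t
text_row_out(const Row *row, char *buf, size_t buflen)
{
    return (size_t) snprintf(buf, buflen, "%d\n", row->value);
}

/* Generic path: one indirect call per row through the pointer. */
static size_t
copy_rows_generic(const Row *rows, size_t nrows, row_out_fn fn)
{
    char   buf[32];
    size_t total = 0;

    for (size_t i = 0; i < nrows; i++)
        total += fn(&rows[i], buf, sizeof(buf));
    return total;
}

/* Built-in path: direct call in the loop, which the compiler can inline. */
static size_t
copy_rows_text(const Row *rows, size_t nrows)
{
    char   buf[32];
    size_t total = 0;

    for (size_t i = 0; i < nrows; i++)
        total += text_row_out(&rows[i], buf, sizeof(buf));
    return total;
}

/* Choose the path once per operation rather than once per row. */
size_t
copy_rows(const Row *rows, size_t nrows, row_out_fn custom)
{
    if (custom == NULL)
        return copy_rows_text(rows, nrows);
    return copy_rows_generic(rows, nrows, custom);
}
```

Calling copy_rows(rows, nrows, NULL) takes the direct path, while
passing a callback takes the indirect one. Both produce the same
output, and only the per-row call overhead differs.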

>
> > I'll try to investigate the performance regression that happened on macOS.
>
> I don't have a good explanation for this one. Did you mount the data
> folder on a tmpfs and make sure that all the workloads were
> CPU-bound?

Yes, I used tmpfs and the workloads were CPU-bound.

>
> > I think that the other performance differences in my results seem to
> > be within noise and could be acceptable. Of course, it would be great
> > if others could also try to run benchmark tests.
>
> Yeah. At 1~2% it could be noise, but there are reproducible 1~2%
> changes. Changes in the good direction here, I mean.

In real workloads, COPY FROM/TO operations would be more disk I/O
bound, so I think the 1~2% performance differences seen in these
CPU-bound workloads would not be a problem in practice.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
