Re: Pg 16: will pg_dump & pg_restore be faster?

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: David Rowley <dgrowleyml(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>
Cc: Ron <ronljohnsonjr(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: Pg 16: will pg_dump & pg_restore be faster?
Date: 2023-06-02 12:14:45
Message-ID: d966a173-d66c-c873-a154-50ab822ea933@postgresql.org
Lists: pgsql-general

On 5/30/23 10:05 PM, David Rowley wrote:

> My understanding had been that concurrency was required, but I see the
> commit message for 00d1e02be mentions:
>
>> Even single threaded
>> COPY is measurably faster, primarily due to not dirtying pages while
>> extending, if supported by the operating system (see commit 4d330a61bb1).
>
> If that's the case then maybe the beta release notes could be edited
> slightly to reflect this. Maybe something like:
>
> "Relation extensions have been improved allowing faster bulk loading
> of data using COPY. These improvements are more significant when
> multiple processes are concurrently loading data into the same table."
>
> The current text of "PostgreSQL 16 can also improve the performance of
> concurrent bulk loading of data using COPY up to 300%." does lead me
> to believe that nothing has been done to improve things when only a
> single backend is involved.

Typically once a release announcement is out, we'll only edit it if it's
inaccurate. I don't think the statement in the release announcement is
inaccurate, as it specifies that concurrent bulk loading is faster.

I had based the description on what Andres described in the original
discussion and on reading [1], which showed a "measurable"
improvement, as the commit message said, but not to the same
degree as with concurrent loading. It does still seem impactful -- the
results show up to 20% improvement on a single backend -- but the bigger
story was around the concurrency.

I'm -0.5 for revising the announcement, but I also don't want people to
miss out on testing this. I'd be OK with this:

"PostgreSQL 16 can also improve the performance of bulk loading of data,
with some tests showing up to a 300% improvement when concurrently
executing `COPY` commands."

Thanks,

Jonathan

[1] https://www.postgresql.org/message-id/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
