Re: Pg 16: will pg_dump & pg_restore be faster?

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Bruce Momjian <bruce(at)momjian(dot)us>
Cc:	David Rowley <dgrowleyml(at)gmail(dot)com>, Ron <ronljohnsonjr(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org
Subject:	Re: Pg 16: will pg_dump & pg_restore be faster?
Date:	2023-05-31 13:45:52
Message-ID:	20230531134552.yvouy5k573irlddt@alap3.anarazel.de
Lists:	pgsql-general

Hi,

On 2023-05-30 21:13:08 -0400, Bruce Momjian wrote:
> On Wed, May 31, 2023 at 09:14:20AM +1200, David Rowley wrote:
> > On Wed, 31 May 2023 at 08:54, Ron <ronljohnsonjr(at)gmail(dot)com> wrote:
> > > https://www.postgresql.org/about/news/postgresql-16-beta-1-released-2643/
> > > says "PostgreSQL 16 can also improve the performance of concurrent bulk
> > > loading of data using COPY up to 300%."
> > >
> > > Since pg_dump & pg_restore use COPY (or something very similar), will the
> > > speed increase translate to higher speeds for those utilities?
> >
> > I think the improvements to relation extension only help when multiple
> > backends need to extend the relation at the same time. pg_restore can
> > have multiple workers, but the tasks that each worker performs are
> > only divided as far as an entire table, i.e. 2 workers will never be
> > working on the same table at the same time. So there is no concurrency
> > in terms of 2 or more workers working on loading data into the same
> > table at the same time.
> >
> > It might be an interesting project now that we have TidRange scans, to
> > have pg_dump split larger tables into chunks so that they can be
> > restored in parallel.
>
> Uh, the release notes say:
>
> <listitem>
> <para>
> Allow more efficient addition of heap and index pages (Andres Freund)
> </para>
> </listitem>
>
> There is no mention of concurrency being a requirement. Is it wrong? I
> think there was a question of whether you had to add _multiple_ blocks
> to get a benefit, not if concurrency was needed. This email about the
> release notes didn't mention the concurrent requirement:

> https://www.postgresql.org/message-id/20230521171341.jjxykfsefsek4kzj%40awork3.anarazel.de

There are multiple improvements that work together to get the overall
improvement. One part of that is filesystem interactions, another is holding
the relation extension lock for a *much* shorter time. The former helps
regardless of concurrency, the latter only with concurrency.

Regards,

Andres
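
To illustrate the distinction drawn above, here is a toy model in Python. It is not PostgreSQL code and not taken from the message: the function name, parameters, and cost units are all hypothetical, and the estimate is only a rough lower bound. It assumes each relation extension costs a fixed amount of work, part of which happens while holding the extension lock, and that only the locked part is serialized across workers.

```python
def estimated_wall_time(workers, extensions_per_worker,
                        work_per_extension, locked_part):
    """Rough lower bound on wall-clock time, in arbitrary units.

    Toy model (an assumption, not PostgreSQL's actual behavior):
    - every extension costs `work_per_extension` units for the worker doing it;
    - `locked_part` of those units is spent holding the extension lock,
      so that part cannot overlap between workers.
    """
    if not 0 <= locked_part <= work_per_extension:
        raise ValueError("locked_part must be between 0 and work_per_extension")
    # Time one worker needs for its own extensions, ignoring lock waits.
    per_worker = extensions_per_worker * work_per_extension
    # Total time the lock is held; this is serialized across all workers.
    serialized = workers * extensions_per_worker * locked_part
    return max(per_worker, serialized)


# One worker: shortening the lock hold time alone changes nothing.
single_long_lock = estimated_wall_time(1, 100, 10, 8)
single_short_lock = estimated_wall_time(1, 100, 10, 1)

# Eight workers on the same relation: the shorter hold time matters.
multi_long_lock = estimated_wall_time(8, 100, 10, 8)
multi_short_lock = estimated_wall_time(8, 100, 10, 1)

# Cheaper work per extension (e.g. fewer filesystem interactions)
# helps even with a single worker.
single_cheaper_work = estimated_wall_time(1, 100, 5, 1)
```

In this model a single pg_restore worker per table sees no gain from the shorter lock hold time, but still benefits from any reduction in the per-extension cost, which matches the two cases described in the message.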

In response to

Re: Pg 16: will pg_dump & pg_restore be faster? at 2023-05-31 01:13:08 from Bruce Momjian