From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Bruce Momjian <bruce(at)momjian(dot)us> |
Cc: | David Rowley <dgrowleyml(at)gmail(dot)com>, Ron <ronljohnsonjr(at)gmail(dot)com>, pgsql-general(at)lists(dot)postgresql(dot)org |
Subject: | Re: Pg 16: will pg_dump & pg_restore be faster? |
Date: | 2023-05-31 13:45:52 |
Message-ID: | 20230531134552.yvouy5k573irlddt@alap3.anarazel.de |
Lists: | pgsql-general |
Hi,
On 2023-05-30 21:13:08 -0400, Bruce Momjian wrote:
> On Wed, May 31, 2023 at 09:14:20AM +1200, David Rowley wrote:
> > On Wed, 31 May 2023 at 08:54, Ron <ronljohnsonjr(at)gmail(dot)com> wrote:
> > > https://www.postgresql.org/about/news/postgresql-16-beta-1-released-2643/
> > > says "PostgreSQL 16 can also improve the performance of concurrent bulk
> > > loading of data using COPY up to 300%."
> > >
> > > Since pg_dump & pg_restore use COPY (or something very similar), will the
> > > speed increase translate to higher speeds for those utilities?
> >
> > I think the improvements to relation extension only help when multiple
> > backends need to extend the relation at the same time. pg_restore can
> > have multiple workers, but the tasks that each worker performs are
> > only divided as far as an entire table, i.e. 2 workers will never be
> > working on the same table at the same time. So there is no concurrency
> > in terms of 2 or more workers working on loading data into the same
> > table at the same time.
> >
> > It might be an interesting project now that we have TidRange scans, to
> > have pg_dump split larger tables into chunks so that they can be
> > restored in parallel.
>
> Uh, the release notes say:
>
> <!--
> Author: Andres Freund <andres(at)anarazel(dot)de>
> 2023-04-06 [00d1e02be] hio: Use ExtendBufferedRelBy() to extend tables more eff
> Author: Andres Freund <andres(at)anarazel(dot)de>
> 2023-04-06 [26158b852] Use ExtendBufferedRelTo() in XLogReadBufferExtended()
> -->
>
> <listitem>
> <para>
> Allow more efficient addition of heap and index pages (Andres Freund)
> </para>
> </listitem>
>
> There is no mention of concurrency being a requirement. Is it wrong? I
> think there was a question of whether you had to add _multiple_ blocks
> to get a benefit, not if concurrency was needed. This email about the
> release notes didn't mention the concurrent requirement:
> https://www.postgresql.org/message-id/20230521171341.jjxykfsefsek4kzj%40awork3.anarazel.de
There are multiple improvements that work together to get the overall
improvement. One part of that is filesystem interactions, another is holding
the relation extension lock for a *much* shorter time. The former helps
regardless of concurrency, the latter only with concurrency.
Regards,
Andres
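
Editorial sketch (not part of the original message): the distinction drawn above can be illustrated with a toy cost model. This is only a back-of-the-envelope approximation under stated assumptions, not how PostgreSQL measures or implements relation extension. The function name `load_time`, its parameters, and all numbers are hypothetical. It assumes each extension spends some time under the relation extension lock (serialized across backends) and some time outside it (parallel across backends).

```python
def load_time(workers, extensions_per_worker, lock_held, outside_lock):
    """Rough lower bound on wall-clock time for backends extending ONE relation.

    Hypothetical model: every extension costs `lock_held` time units while
    holding the relation extension lock (only one backend at a time) plus
    `outside_lock` units that can overlap with other backends.
    """
    # Time one backend needs for its own work, ignoring waits on the lock.
    per_worker = extensions_per_worker * (lock_held + outside_lock)
    # Time the lock is busy in total; this part cannot be parallelized.
    serialized = workers * extensions_per_worker * lock_held
    return max(per_worker, serialized)


# Made-up numbers. "Shorter lock" keeps total work per extension at 10 units
# but moves most of it outside the lock; "cheaper I/O" reduces the work itself.
long_lock = dict(lock_held=8, outside_lock=2)
short_lock = dict(lock_held=1, outside_lock=9)
cheaper_io = dict(lock_held=4, outside_lock=2)

for workers in (1, 8):
    print(workers,
          load_time(workers, 100, **long_lock),
          load_time(workers, 100, **short_lock),
          load_time(workers, 100, **cheaper_io))
```

In this model a single backend sees no change from the shorter lock hold time, since total work per extension is unchanged, but it does benefit from cheaper per-extension work. With several backends on the same relation, the lock hold time becomes the bottleneck and shortening it dominates.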