Re: Native Logical Replication Initial Import Qs

From: Don Seiler <don(at)seiler(dot)us>
To: Jeremy Schneider <schneider(at)ardentperf(dot)com>
Cc: "pgsql-generallists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Native Logical Replication Initial Import Qs
Date: 2023-06-22 15:30:39
Message-ID: CAHJZqBC+FWMoHSE5oqc=nYDL0Q5LzHC4gnF3EdVS86R8Ph8FUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Jun 7, 2023 at 4:30 PM Jeremy Schneider <schneider(at)ardentperf(dot)com>
wrote:

> On 6/7/23 2:12 PM, Don Seiler wrote:
> > On the logical replication front, the concern is with the initial data
> > import that happens when the subscription is created (by default). I
> > know that you can tell the subscription to not copy data and instead use
> > pg_dump and a replication slot snapshot to achieve this manually.
> > However I'm unable to explain (to myself) why this is better than just
> > having the subscription do it upon creation. Given that I can create
> > pub/sub sets for individual tables for parallel operations, I'm curious
> > what advantages there are in using pg_dump to do this import.
>
> FWIW, I think the place this feature shines the most is when you can
> safely leverage things like storage-level snapshots. Sometimes that
> means you can get a copy of a multi-TB database almost instantly if the
> storage or filesystem does copy-on-write, for example database lab
> (postgres.ai) which uses ZFS.
>
> Another thing I can think of is that while it's true you can create
> multiple pub/sub sets, I'm not sure you can reduce the number of sets
> later. So if you were concerned about having too many slots doing
> decoding on the source, then you might want the flexibility of pg_dump
> (or perhaps restoring a backup) to get more parallelism while having
> more control over how many slots will be used later.
>
> In your case, the whole setup is hopefully temporary, so maybe these
> particular concerns aren't as relevant to you.
>
> This is just what comes to mind... probably there's a few more things
> I'm not thinking of and hopefully others will chime in. :)
>

Yes my setup would be temporary, just as long as needed to complete the
migration to a new host.

One other use case is using pg_dump/restore to move older data in
time-based partitions (assuming data isn't modified for older partition
periods). Those older partitions would just need to be copied but wouldn't
need any ongoing pub/sub/replication set up for them. So you'd have both in
use in that case.

Thanks,
Don.

--
Don Seiler
www.seiler.us

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2023-06-22 16:02:07 Re: a really dumb password question
Previous Message Martin Mueller 2023-06-22 15:05:52 a really dumb password question