Re: Logical Replication of sequences

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Peter Smith <smithpb2250(at)gmail(dot)com>
Cc: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2024-07-31 09:33:15
Message-ID: CALDaNm0=NMfJEhniSU9UoynsHVHHWznEVvafBv25vK_cos03DA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 20 Jul 2024 at 20:48, vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Fri, 12 Jul 2024 at 08:22, Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > Hi Vignesh. Here are the rest of my comments for patch v20240705-0003.
> > ======
> >
> > 8. logicalrep_sequence_sync_worker_find
> >
> > +/*
> > + * Walks the workers array and searches for one that matches given
> > + * subscription id.
> > + *
> > + * We are only interested in the sequence sync worker.
> > + */
> > +LogicalRepWorker *
> > +logicalrep_sequence_sync_worker_find(Oid subid, bool only_running)
> >
> > There are other similar functions for walking the workers array to
> > search for a worker. Instead of having different functions for
> > different cases, wouldn't it be cleaner to combine these into a single
> > function, where you pass a parameter (e.g. a mask of worker types that
> > you are interested in finding)?

This is fixed in the v20240730_2 version attached at [1].

> > 17.
> > Also, where does the number 100 come from? Why not 1000? Why not 10?
> > Why have batching at all? Maybe there should be some comment to
> > describe the reason and the chosen value.

I had run some tests with 10/100 and 1000 sequences per batch for
10000 sequences. The results for it:
10 per batch - 4.94 seconds
100 per batch - 4.87 seconds
1000 per batch - 4.53 seconds

There is not much time difference between each of them. Currently, it
is set to 100, which seems fine since it will not generate a lot of
transactions. Additionally, the locks on the sequences will be
periodically released during the commit transaction.

I had used the test from the attached patch by changing
max_sequences_sync_per_batch to 10/100/100 in 035_sequences.pl to
verify this.

[1] - https://www.postgresql.org/message-id/CALDaNm3%2BXzHAbgyn8gmbBLK5goyv_uyGgHEsTQxRZ8bVk6nAEg%40mail.gmail.com

Regards,
Vignesh

Attachment Content-Type Size
sequence_perf.txt text/plain 463 bytes
0001-Performance-testing-changes.patch application/x-patch 8.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Laurenz Albe 2024-07-31 09:45:00 Re: proposal: schema variables
Previous Message Peter Eisentraut 2024-07-31 09:27:58 Re: Proposal: Document ABI Compatibility