Re: Logical Replication of sequences

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Hou, Zhijie/侯 志杰 <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: Re: Logical Replication of sequences
Date: 2024-08-13 07:00:53
Message-ID: CAHut+PsrE75WNfyGy7fLx3XNsN_jS-dz0BJskSU=Fq0wO2KJUw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Vignesh,

I have been using the latest patchset, trying a few things using many
(1000) sequences.

Here are some observations, plus some suggestions for consideration.

~~~~~

OBSERVATION #1

When 1000s of sequences are refreshed using REFRESH PUBLICATION
SEQUENCES the logging is excessive. For example, since there is only
one sequencesync worker why does it need to broadcast that it is
"finished" separately for every single sequence. That is giving 1000s
of lines of logs which don't seem to be of much interest to a user.

...
2024-08-13 16:17:04.151 AEST [5002] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0918" has
finished
2024-08-13 16:17:04.151 AEST [5002] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0919" has
finished
2024-08-13 16:17:04.151 AEST [5002] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0920" has
finished
2024-08-13 16:17:04.151 AEST [5002] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0921" has
finished
2024-08-13 16:17:04.151 AEST [5002] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0922" has
finished
2024-08-13 16:17:04.151 AEST [5002] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0923" has
finished
...

Perhaps just LOG when each "batch" is completed, but the individual
sequence finished logs can just be DEBUG information?

~~~~~

OBSERVATION #2

When 1000s of sequences are refreshed (set to INIT) then there are
1000s of logs like below:

...
2024-08-13 16:13:57.873 AEST [10301] LOG: sequence "public.seq_0698"
of subscription "sub3" set to INIT state
2024-08-13 16:13:57.873 AEST [10301] STATEMENT: alter subscription
sub3 refresh publication sequences;
2024-08-13 16:13:57.873 AEST [10301] LOG: sequence "public.seq_0699"
of subscription "sub3" set to INIT state
2024-08-13 16:13:57.873 AEST [10301] STATEMENT: alter subscription
sub3 refresh publication sequences;
2024-08-13 16:13:57.873 AEST [10301] LOG: sequence "public.seq_0700"
of subscription "sub3" set to INIT state
2024-08-13 16:13:57.873 AEST [10301] STATEMENT: alter subscription
sub3 refresh publication sequences;
2024-08-13 16:13:57.873 AEST [10301] LOG: sequence "public.seq_0701"
of subscription "sub3" set to INIT state
2024-08-13 16:13:57.873 AEST [10301] STATEMENT: alter subscription
sub3 refresh publication sequences;
2024-08-13 16:13:57.874 AEST [10301] LOG: sequence "public.seq_0702"
of subscription "sub3" set to INIT state
2024-08-13 16:13:57.874 AEST [10301] STATEMENT: alter subscription
sub3 refresh publication sequences;
...

I felt that showing the STATEMENT for all of these is overkill. How
about changing that ereport LOG so it does not emit the statement 1000
times? Or, maybe you can implement it as a "dynamic" log that emits
the STATEMENT if there are only a few logs a few times but skips it
for the next 995 logs.

~~~~~

OBSERVATION #3

The WARNING about mismatched sequences currently looks like this:

2024-08-13 16:41:45.496 AEST [10301] WARNING: Parameters differ for
remote and local sequences "public.seq_0999"
2024-08-13 16:41:45.496 AEST [10301] HINT: Alter/Re-create the
sequence using the same parameter as in remote.

Although you could probably deduce it from nearby logs, I think it
might be more helpful to also identify the subscription name in this
WARNING message. Otherwise, if there are many publications the user
may have no idea where the mismatched "remote" is coming from.

~~~~

OBSERVATION #4

When 1000s of sequences are refreshed then there are 1000s of
associated logs. But (given there is only one sequencesync worker)
those logs are not always the order that I was expecting to see them.

e.g.
...
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0885" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0887" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0888" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0889" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0890" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0906" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0566" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0568" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0569" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0570" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0571" has
finished
2024-08-13 16:41:47.436 AEST [11735] LOG: logical replication
synchronization for subscription "sub3", sequence "seq_0582" has
finished
...

Is there a way to refresh sequences in a more natural (e.g.
alphabetical) order to make these logs more readable?

======
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message 胡常齐 2024-08-13 07:26:41 Re: Re: PG buildfarm member cisticola
Previous Message Heikki Linnakangas 2024-08-13 06:23:10 Re: Thread-safe nl_langinfo() and localeconv()