From: | vignesh C <vignesh21(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Skipping logical replication transactions on subscriber side |
Date: | 2021-11-04 15:57:43 |
Message-ID: | CALDaNm3yJmXBZ3AL5ooif0zud4aYTgRbDo10FgmGMahgKe9LMg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Oct 29, 2021 at 10:55 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Thu, Oct 28, 2021 at 7:40 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Thu, Oct 28, 2021 at 10:36 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Wed, Oct 27, 2021 at 7:02 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > >
> > > > On Thu, Oct 21, 2021 at 10:29 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > >
> > > > >
> > > > > I've attached updated patches.
> > >
> > > Thank you for the comments!
> > >
> > > >
> > > > Few comments:
> > > > ==============
> > > > 1. Is the patch cleaning tablesync error entries except via vacuum? If
> > > > not, can't we send a message to remove tablesync errors once tablesync
> > > > is successful (say when we reset skip_xid or when tablesync is
> > > > finished) or when we drop subscription? I think the same applies to
> > > > apply worker. I think we may want to track it in some way whether an
> > > > error has occurred before sending the message but relying completely
> > > > on a vacuum might be the recipe of bloat. I think in the case of a
> > > > drop subscription we can simply send the message as that is not a
> > > > frequent operation. I might be missing something here because in the
> > > > tests after drop subscription you are expecting the entries from the
> > > > view to get cleared
> > >
> > > Yes, I think we can have tablesync worker send a message to drop stats
> > > once tablesync is successful. But if we do that also when dropping a
> > > subscription, I think we need to do that only the transaction is
> > > committed since we can drop a subscription that doesn't have a
> > > replication slot and rollback the transaction. Probably we can send
> > > the message only when the subscritpion does have a replication slot.
> > >
> >
> > Right. And probably for apply worker after updating skip xid.
>
> I'm not sure it's better to drop apply worker stats after resetting
> skip xid (i.g., after skipping the transaction). Since the view is a
> cumulative view and has last_error_time, I thought we can have the
> apply worker stats until the subscription gets dropped. Since the
> error reporting message could get lost, no entry in the view doesn’t
> mean the worker doesn’t face an issue.
>
> >
> > > In other cases, we can remember the subscriptions being dropped and
> > > send the message to drop the statistics of them after committing the
> > > transaction but I’m not sure it’s worth having it.
> > >
> >
> > Yeah, let's not go to that extent. I think in most cases subscriptions
> > will have corresponding slots.
>
> Agreed.
>
> >
> > FWIW, we completely
> > > rely on pg_stat_vacuum_stats() for cleaning up the dead tables and
> > > functions. And we don't expect there are many subscriptions on the
> > > database.
> > >
> >
> > True, but we do send it for the database, so let's do it for the cases
> > you explained in the first paragraph.
>
> Agreed.
>
> I've attached a new version patch. Since the syntax of skipping
> transaction id is under the discussion I've attached only the error
> reporting patch for now.
Thanks for the updated patch, few comments:
1) This check and return can be moved above CreateTemplateTupleDesc so
that the tuple descriptor need not be created if there is no worker
statistics
+ BlessTupleDesc(tupdesc);
+
+ /* Get subscription worker stats */
+ wentry = pgstat_fetch_subworker(subid, subrelid);
+
+ /* Return NULL if there is no worker statistics */
+ if (wentry == NULL)
+ PG_RETURN_NULL();
+
+ /* Initialise values and NULL flags arrays */
+ MemSet(values, 0, sizeof(values));
+ MemSet(nulls, 0, sizeof(nulls));
2) "NULL for the main apply worker" is mentioned as "null for the main
apply worker" in case of pg_stat_subscription view, we can mention it
similarly.
+ <para>
+ OID of the relation that the worker is synchronizing; NULL for the
+ main apply worker
+ </para></entry>
3) Variable assignment can be done during declaration and this the
assignment can be removed
+ i = 0;
+ /* subid */
+ values[i++] = ObjectIdGetDatum(subid);
4) I noticed that the worker error is still present when queried from
pg_stat_subscription_workers even after conflict is resolved in the
subscriber and the worker proceeds with applying the other
transactions, should this be documented somewhere?
5) This needs to be aligned, the columns in select have used TAB, we
should align it using spaces.
+CREATE VIEW pg_stat_subscription_workers AS
+ SELECT
+ w.subid,
+ s.subname,
+ w.subrelid,
+ w.relid,
+ w.command,
+ w.xid,
+ w.error_count,
+ w.error_message,
+ w.last_error_time,
+ w.stats_reset
Regards,
Vignesh
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2021-11-04 15:58:29 | Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM? |
Previous Message | Andrey Borodin | 2021-11-04 15:52:22 | Re: Why doesn't GiST VACUUM require a super-exclusive lock, like nbtree VACUUM? |