From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Greg Nancarrow <gregn4422(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Alexey Lesovsky <lesovsky(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Skipping logical replication transactions on subscriber side |
Date: | 2021-09-05 13:41:20 |
Message-ID: | CAD21AoCu+VoSLKqKmVhZToXn5+84k_skSZRAQRXRAN+ObuZfaA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Thu, Sep 2, 2021 at 12:06 PM Greg Nancarrow <gregn4422(at)gmail(dot)com> wrote:
>
> On Mon, Aug 30, 2021 at 5:07 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> >
> > I've attached rebased patches. 0004 patch is not the scope of this
> > patch. It's borrowed from another thread[1] to fix the assertion
> > failure for newly added tests. Please review them.
> >
>
> I have some initial feedback on the v12-0001 patch.
> Most of these are suggested improvements to wording, and some typo fixes.
Thank you for the comments!
>
>
> (0) Patch comment
>
> Suggestion to improve the patch comment:
>
> BEFORE:
> Add pg_stat_subscription_errors statistics view.
>
> This commits adds new system view pg_stat_logical_replication_error,
Oops, I realized that it should be pg_stat_subscription_errors.
> showing errors happening during applying logical replication changes
> as well as during performing initial table synchronization.
>
> The subscription error entries are removed by autovacuum workers when
> the table synchronization competed in table sync worker cases and when
> dropping the subscription in apply worker cases.
>
> It also adds SQL function pg_stat_reset_subscription_error() to
> reset the single subscription error.
>
> AFTER:
> Add a subscription errors statistics view "pg_stat_subscription_errors".
>
> This commits adds a new system view pg_stat_logical_replication_errors,
> that records information about any errors which occur during application
> of logical replication changes as well as during performing initial table
> synchronization.
I think that views don't have any data so "show information" seems
appropriate to me here. Thoughts?
>
> The subscription error entries are removed by autovacuum workers after
> table synchronization completes in table sync worker cases and after
> dropping the subscription in apply worker cases.
>
> It also adds an SQL function pg_stat_reset_subscription_error() to
> reset a single subscription error.
>
>
>
> doc/src/sgml/monitoring.sgml:
>
> (1)
> BEFORE:
> + <entry>One row per error that happened on subscription, showing
> information about
> + the subscription errors.
> AFTER:
> + <entry>One row per error that occurred on subscription,
> providing information about
> + each subscription error.
Fixed.
>
> (2)
> BEFORE:
> + The <structname>pg_stat_subscription_errors</structname> view will
> contain one
> AFTER:
> + The <structname>pg_stat_subscription_errors</structname> view contains one
>
I think that descriptions of other statistics view also say "XXX view
will contain ...".
>
> (3)
> BEFORE:
> + Name of the database in which the subscription is created.
> AFTER:
> + Name of the database in which the subscription was created.
Fixed.
>
> (4)
> BEFORE:
> + OID of the relation that the worker is processing when the
> + error happened.
> AFTER:
> + OID of the relation that the worker was processing when the
> + error occurred.
>
Fixed.
>
> (5)
> BEFORE:
> + Name of command being applied when the error happened. This
> + field is always NULL if the error is reported by
> + <literal>tablesync</literal> worker.
> AFTER:
> + Name of command being applied when the error occurred. This
> + field is always NULL if the error is reported by a
> + <literal>tablesync</literal> worker.
Fixed.
> (6)
> BEFORE:
> + Transaction ID of publisher node being applied when the error
> + happened. This field is always NULL if the error is reported
> + by <literal>tablesync</literal> worker.
> AFTER:
> + Transaction ID of the publisher node being applied when the error
> + happened. This field is always NULL if the error is reported
> + by a <literal>tablesync</literal> worker.
Fixed.
> (7)
> BEFORE:
> + Type of worker reported the error: <literal>apply</literal> or
> + <literal>tablesync</literal>.
> AFTER:
> + Type of worker reporting the error: <literal>apply</literal> or
> + <literal>tablesync</literal>.
Fixed.
>
> (8)
> BEFORE:
> + Number of times error happened on the worker.
> AFTER:
> + Number of times the error occurred in the worker.
>
> [or "Number of times the worker reported the error" ?]
I prefer "Number of times the error occurred in the worker."
>
> (9)
> BEFORE:
> + Time at which the last error happened.
> AFTER:
> + Time at which the last error occurred.
Fixed.
>
> (10)
> BEFORE:
> + Error message which is reported last failure time.
> AFTER:
> + Error message which was reported at the last failure time.
>
> Maybe this should just say "Last reported error message" ?
Fixed.
>
>
> (11)
> You shouldn't call hash_get_num_entries() on a NULL pointer.
>
> Suggest swappling lines, as shown below:
>
> BEFORE:
> + int32 nerrors = hash_get_num_entries(subent->suberrors);
> +
> + /* Skip this subscription if not have any errors */
> + if (subent->suberrors == NULL)
> + continue;
> AFTER:
> + int32 nerrors;
> +
> + /* Skip this subscription if not have any errors */
> + if (subent->suberrors == NULL)
> + continue;
> + nerrors = hash_get_num_entries(subent->suberrors);
Right. Fixed.
>
>
> (12)
> Typo: legnth -> length
>
> + * contains the fixed-legnth error message string which is
Fixed.
>
>
> src/backend/postmaster/pgstat.c
>
> (13)
> "Subscription stat entries" hashtable is created in two different
> places, one with HASH_CONTEXT and the other without. Is this
> intentional?
> Shouldn't there be a single function for creating this?
Yes, it's intentional. It's consistent with hash tables for other statistics.
>
>
> (14)
> + * PgStat_MsgSubscriptionPurge Sent by the autovacuum purge the subscriptions.
>
> Seems to be missing a word, is it meant to say "Sent by the autovacuum
> to purge the subscriptions." ?
Yes, fixed.
>
> (15)
> + * PgStat_MsgSubscriptionErrPurge Sent by the autovacuum purge the subscription
> + * errors.
>
> Seems to be missing a word, is it meant to say "Sent by the autovacuum
> to purge the subscription errors." ?
Thanks, fixed.
Regards,
--
Masahiko Sawada
EDB: https://www.enterprisedb.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Masahiko Sawada | 2021-09-05 13:41:43 | Re: Skipping logical replication transactions on subscriber side |
Previous Message | Thomas Munro | 2021-09-05 13:32:55 | Re: stat() vs ERROR_DELETE_PENDING, round N + 1 |