From: | Peter Smith <smithpb2250(at)gmail(dot)com> |
---|---|
To: | "tanghy(dot)fnst(at)fujitsu(dot)com" <tanghy(dot)fnst(at)fujitsu(dot)com> |
Cc: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Design of pg_stat_subscription_workers vs pgstats |
Date: | 2022-02-24 08:53:33 |
Message-ID: | CAHut+Pv=VmXtHmPKp4fg8VDF+TQP6xWgL91Jn-hrqg5QObfCZA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi. Below are my review comments for the v2 patch.
======
1. Commit message
This patch changes the pg_stat_subscription_workers view (introduced
by commit 8d74fc9) so that it stores only statistics counters:
apply_error_count and sync_error_count, and has one entry for
subscription.
SUGGESTION
"and has one entry for subscription." --> "and has one entry for each
subscription."
~~~
2. Commit message
After removing these error details, there are no longer relation
information, so the subscription statistics are now a cluster-wide
statistics.
SUGGESTION
"there are no longer relation information," --> "there is no longer
any relation information,"
~~~
3. doc/src/sgml/monitoring.sgml
- <para>
- The error message
+ Number of times the error occurred during the application of changes
</para></entry>
</row>
<row>
<entry role="catalog_table_entry"><para role="column_definition">
- <structfield>last_error_time</structfield> <type>timestamp
with time zone</type>
+ <structfield>sync_error_count</structfield> <type>bigint</type>
</para>
<para>
- Last time at which this error occurred
+ Number of times the error occurred during the initial table
+ synchronization
</para></entry>
SUGGESTION (both places)
"Number of times the error occurred ..." --> "Number of times an error
occurred ..."
~~~
4. doc/src/sgml/monitoring.sgml - missing column
(duplicate - also reported by [Tang-v2])
The PG docs for the new "stats_reset" column are missing.
~~~
5. src/backend/catalog/system_views.sql - whitespace
(duplicate - also reported by [Tang-v2])
- JOIN pg_subscription s ON (w.subid = s.oid);
+ a.apply_error_count,
+ a.sync_error_count,
+ a.stats_reset
+ FROM pg_subscription as s,
+ pg_stat_get_subscription_activity(oid) as a;
inconsistent tab/space indenting for 'a.stats_reset'.
~~~
6. src/backend/postmaster/pgstat.c - function name
+/* ----------
+ * pgstat_reset_subscription_counter() -
+ *
+ * Tell the statistics collector to reset a single subscription
+ * counter, or all subscription counters (when subid is InvalidOid).
+ *
+ * Permission checking for this function is managed through the normal
+ * GRANT system.
+ * ----------
+ */
+void
+pgstat_reset_subscription_counter(Oid subid)
SUGGESTION (function name)
"pgstat_reset_subscription_counter" -->
"pgstat_reset_subscription_counters" (plural)
~~
7. src/backend/postmaster/pgstat.c - pgstat_recv_resetsubcounter
@@ -5645,6 +5598,51 @@
pgstat_recv_resetreplslotcounter(PgStat_MsgResetreplslotcounter *msg,
}
}
+/* ----------
+ * pgstat_recv_resetsubcounter() -
+ *
+ * Reset some subscription statistics of the cluster.
+ * ----------
+ */
+static void
+pgstat_recv_resetsubcounter(PgStat_MsgResetsubcounter *msg, int len)
"Reset some" seems a bit vague. Why not describe that it is all or
none according to the msg->m_subid?
~~~
8. src/backend/postmaster/pgstat.c - pgstat_recv_resetsubcounter
+ if (!OidIsValid(msg->m_subid))
+ {
+ HASH_SEQ_STATUS sstat;
+
+ /* Clear all subscription counters */
+ hash_seq_init(&sstat, subscriptionStatHash);
+ while ((subentry = (PgStat_StatSubEntry *) hash_seq_search(&sstat)) != NULL)
+ pgstat_reset_subscription(subentry, ts);
+ }
+ else
+ {
+ /* Get the subscription statistics to reset */
+ subentry = pgstat_get_subscription_entry(msg->m_subid, false);
+
+ /*
+ * Nothing to do if the given subscription entry is not found. This
+ * could happen when the subscription with the subid is removed and
+ * the corresponding statistics entry is also removed before receiving
+ * the reset message.
+ */
+ if (!subentry)
+ return;
+
+ /* Reset the stats for the requested replication slot */
+ pgstat_reset_subscription(subentry, ts);
+ }
+}
Why not reverse the if/else?
Checking OidIsValid(...) seems more natural than checking !OidIsValid(...)
~~~
9. src/backend/postmaster/pgstat.c - pgstat_recv_subscription_purge
static void
pgstat_recv_subscription_purge(PgStat_MsgSubscriptionPurge *msg, int len)
{
/* Return if we don't have replication subscription statistics */
if (subscriptionStatHash == NULL)
return;
/* Remove from hashtable if present; we don't care if it's not */
(void) hash_search(subscriptionStatHash, (void *) &(msg->m_subid),
HASH_REMOVE, NULL);
}
SUGGESTION
Wouldn't the above code be simpler written like:
if (subscriptionStatHash)
{
/* Remove from hashtable if present; we don't care if it's not */
(void) hash_search(subscriptionStatHash, (void *) &(msg->m_subid),
HASH_REMOVE, NULL);
}
~~~
10. src/backend/replication/logical/worker.c
(from my previous [Peter-v1] #9)
>> + /* Report the worker failed during table synchronization */
>> + pgstat_report_subscription_error(MyLogicalRepWorker->subid, false);
>>
>> and
>>
>> + /* Report the worker failed during the application of the change */
>> + pgstat_report_subscription_error(MyLogicalRepWorker->subid, true);
>>
>>
>> Why don't these use MySubscription->oid instead of MyLogicalRepWorker->subid?
> It's just because we used to use MyLogicalRepWorker->subid, is there
> any particular reason why we should use MySubscription->oid here?
I felt MySubscription->oid is a more natural and more direct way of
expressing the same thing.
Consider: "the oid of the current subscription" versus "the oid of
the subscription of the current worker". IMO the first one is simpler.
YMMV.
Also, it is shorter :)
~~~
11. src/include/pgstat.h - enum order
(follow-on from my previous v1 review comment #10)
>I assume you're concerned about binary compatibility or something. I
>think it should not be a problem since both
>PGSTAT_MTYPE_SUBWORKERERROR and PGSTAT_MTYPE_SUBSCRIPTIONPURGE are
>introduced to PG15.
Yes, maybe it is OK for those ones. But now in v2 there is a new
PGSTAT_MTYPE_RESETSUBCOUNTER.
Shouldn't at least that one be put at the end for the same reason?
~~~
12. src/include/pgstat.h - PgStat_MsgResetsubcounter
Maybe a better name for this is "PgStat_MsgResetsubcounters" (plural)?
~~~
13. src/test/subscription/t/026_worker_stats.pl - missing test?
Shouldn't there also be some test to reset the counters to confirm
that they really do get reset to zero?
~~~
14. src/tools/pgindent/typedefs.list
PgStat_MsgResetsubcounter (from pgstat.h) is missing?
------
15. pg_stat_subscription_activity view name?
Has the view name already been decided or still under discussion - I
was not sure?
If is it already decided then fine, but if not then my vote would be
for something different like:
e.g.1 - pg_stat_subscription_errors
e.g.2 - pg_stat_subscription_counters
e.g.3 - pg_stat_subscription_metrics
Maybe "activity" was chosen to be deliberately vague in case some
future unknown stats columns get added? But it means now there is a
corresponding function "pg_stat_reset_subscription_activity", when in
practice you don't really reset activity - what you want to do is
reset some statistics *about* the activity... so it all seems a bit
odd to me.
------
[Tang-v2] https://www.postgresql.org/message-id/OS0PR01MB6113769B17E90ADC9ACA14B2FB3D9%40OS0PR01MB6113.jpnprd01.prod.outlook.com
[Peter-v1] https://www.postgresql.org/message-id/CAHut%2BPtH-uN5rbGRh-%3DkCd8xvQYDf_JCcjLcVjW3OXGz6T%2BxCw%40mail.gmail.com
Kind Regards,
Peter Smith.
Fujitsu Australia
From | Date | Subject | |
---|---|---|---|
Next Message | Aleksander Alekseev | 2022-02-24 08:56:36 | Re: Support logical replication of DDLs |
Previous Message | Kyotaro Horiguchi | 2022-02-24 08:27:03 | Re: [BUG] Panic due to incorrect missingContrecPtr after promotion |