Re: On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null?

From: Koen De Groote <kdg(dot)dev(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null?
Date: 2024-08-23 19:31:51
Message-ID: CAGbX52GSTNeOiDb7L_OGs+6XyirU-uEJZ35c-dcjvX1xxm85=w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Adrian,

Thanks for pointing me at the source code.

Digging a bit, the view seems to eventually get its data on last msg send
time from here:
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/launcher.c;h=c566d50a072b92bd07f4179100275d0d0b1f4c7c;hb=HEAD#l1288

And in particular, this:

1332 if (worker.last_send_time == 0)
1333 nulls[5] = true;
1334 else
1335 values[5] = TimestampTzGetDatum(worker.last_send_time);
1336 if (worker.last_recv_time == 0)
1337 nulls[6] = true;
1338 else
1339 values[6] = TimestampTzGetDatum(worker.last_recv_time);

I don't have any knowledge of C, or postgres internals, so I may well be
wrong in what follows:

From the bit of comment you posted my impression is that this means there
are separate workers that each send their update, that is then reflected in
the output of the pg_stat_subscription table... many workers, but only 1
table to show metrics, to show both the update by the leader and the
parallel workers...

And these parallel workers get created on the fly, I assume? So they might
well have a last_send_time of 0 if they haven't done anything yet?

What I would expect to see is a table that tells me how a particular
publishers/subscriber is doing, and the metrics around that process, and
the concept of "when data was last sent" to be persistent on the level of
the publisher/subscriber, not based on the lifespan of ephemeral workers
that each time they get created start at 0 and so now the table claims
"last_send_msg" is NULL because of it.

Am I getting that wrong? Is my understanding mistaken?

Regards,
Koen De Groote

On Fri, Aug 23, 2024 at 5:02 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

> On 8/23/24 07:33, Koen De Groote wrote:
> > I'm running this query:
> >
> > SELECT subname, received_lsn, latest_end_lsn,
> > last_msg_send_time, last_msg_receipt_time
> > FROM pg_catalog.pg_stat_subscription;
> >
> > And "last_msg_send_time" will sometimes be null. All I can find in the
> > documentation is that this value represents: "Send time of last message
> > received from origin WAL sender; NULL for parallel apply workers"
> >
> > But I have no context for what parallel apply workers are or why they
> > are NULL sometimes and other times not. >
> > Can anyone explain?
>
> The source has that information:
>
>
> https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/replication/logical/applyparallelworker.c;h=e7f7d4c5e4bd3ff63687cb21b84534a05a3962cc;hb=HEAD
>
> The top part explains the process.
>
> As to the time:
>
> /*
> * There are three fields in each message received by the parallel apply
> * worker: start_lsn, end_lsn and send_time. Because we have updated these
> * statistics in the leader apply worker, we can ignore these fields in
> the
> * parallel apply worker (see function LogicalRepApplyLoop).
> */
>
> >
> > Regards,
> > Koen De Groote
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2024-08-23 20:14:31 Re: On a subscriber, why is last_msg_send_time in pg_stat_subscription sometimes null?
Previous Message Adrian Klaver 2024-08-23 16:53:15 Re: dead tuple difference between pgstattuple and pg_stat_user_tables