Re: BUG #17804: Assertion failed in pg_stat after fetching from pg_stat_database and switching cache->snapshot

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: exclusion(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #17804: Assertion failed in pg_stat after fetching from pg_stat_database and switching cache->snapshot
Date: 2023-05-01 05:18:00
Message-ID: ZE9LiFc7JdNHokz/@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Fri, Apr 28, 2023 at 04:07:52PM +0900, Kyotaro Horiguchi wrote:
> For the the record, I'm not saying that it is dangerous to clear
> snapshots directly in the callback. In fact, as I mentioned earlier, I
> believe there is no issue with that. But, I believe it is simpler
> that the actual work is separate from the callback path since we don't
> need to worry about when the guc-callback will be called.

pgstat_clear_backend_activity_snapshot() and pgstat_assert_is_up() are
the two points of contention here. The former could be called in a
non-backend context, which would be an incorrect use of this API. I
am not completely sure that pgstat_assert_is_up() would be always
correct, as well, though I have not been able to see a problem.

Anyway, sorry for the late replay, it took me some time to study this
subsystem and understand what's going on here (like why we have fixed
snapshots but these don't list a timestamp, for example). At the end,
I think that I'm OK with your suggestion to just force a snapshot
cleanup each time the GUC is changed, mainly because that's simple to
understand, and do the clear based on the timing where the next
snapshot build would happen. I would be curious to hear arguments if
there would be a need for a more complex mechanism, but that seems
overkill to me as this is mainly here when switching out of the
"snapshot" mode. A commit/abort would force a cleanup of the
snapshot, clearing the flag, as well.

I would add a small note in the docs about this behavior. Another
thing is to add a few pg_stat_get_snapshot_timestamp() after building
a snapshot for each mode tested.

While on it, I have noticed that postgresql.conf.sample does not list
the values available for stats_fetch_consistency. These had better be
added to the sample file, no?

At the end, I am finishing with the attached, without the sample file
part. Thoughts?
--
Michael

Attachment Content-Type Size
v3-0001-Fix-assertion-failure-on-updates-of-stats_fetch_c.patch text/x-diff 8.1 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Michael Paquier 2023-05-01 05:25:50 Re: pg_basebackup: errors on macOS on directories with ".DS_Store" files
Previous Message Maxim Boguk 2023-05-01 04:46:02 Re: BUG #17913: alter column set (n_distinct=...) on partition head doesn't work for declarative partitioned tables