Re: BUG #17947: Combination of replslots pgstat issues causes error/assertion failure

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Floris Van Nee <florisvannee(at)optiver(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "exclusion(at)gmail(dot)com" <exclusion(at)gmail(dot)com>, "pgsql-bugs(at)lists(dot)postgresql(dot)org" <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #17947: Combination of replslots pgstat issues causes error/assertion failure
Date: 2024-06-04 05:25:23
Message-ID: Zl6lQ_K0-QOfqUS4@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On Sun, Jun 02, 2024 at 09:31:56PM +0000, Floris Van Nee wrote:

Your recent reply has attracted me here.

> It looks like pgstat_drop_entry expects callers to take care of
> this. All callers in pgstat_xact.c (AtEOXact_PgStat_DroppedStats for
> example) are dropping stats for several relations at once, and then
> at the end they call pgstat_request_entry_refs_gc once if
> needed. I'm guessing this was an optimization.

Note an extra edge case in pgstat_init_function_usage(), which should
do the same, I guess?

> However, pgstat_drop_replslot didn't get that memo. Attached patch
> adds a call to gc in pgstat_drop_replslot to bring it in line with
> usage in pgstat_xact.c. On my setup this fixes the issue that occurs
> on this thread.

I find this omission for replication slots quite confusing, as well,
as this is incorrect with what's documented in pgstat_internal.h. I
can get the point that it is important to document that on top of the
drop routine as well as that can be easily missed, though your
suggested comment could be simplified a bit more, perhaps.

>> I think there would be a leftover edge-case, namely that
>> pgstat_gc_entry_refs() doesn't gc entries with pending data. But we could
>> just do that - and it should be correct, afaict. I wonder if the reason we didn't
>> is that we introduced PgStat_KindInfo.delete_pending_cb later?
>
> Can you clarify why this would be safe to do? Looking at commit
> history, it looks like they were added in the same commit 5891c7a8e
> which introduced the shared memory stats. I can reason why it
> *should* be safe, but I can't find a discussion around why every
> single function except pgstat_drop_entry uses discard_pending=false
> if it would be safe to pass true everywhere. It seems strange that
> it'd have been chosen without reason.
>
> This does mean it's unfortunately unrelated to the issue I reported
> here
> https://www.postgresql.org/message-id/flat/20240505223546.6yvjzgqifuoiii3e%40awork3.anarazel.de#1dfe3ecf6a75ace833444bdc0d268f4a
> , because the issue in the current thread is fixed by changing a
> replication-slot call (which is not used in the database for which I
> reported it).

Yeah, It's clear that we don't have enough information to proceed
here. It seems to me that there is a good argument for addressing
this one first, separately of the other one.

PS: be careful with the indentation of your messages. You may want to
limit the number of characters per line to 78-ish characters or so.
--
Michael

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Amit Kapila 2024-06-04 05:28:52 Re: BUG #18486: Is there something wrong with the calculation in ReorderBufferChangeSize()?
Previous Message Michael Paquier 2024-06-04 05:04:00 Re: error "can only drop stats once" brings down database