Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Date: 2022-04-18 15:26:02
Message-ID: 2276589.1650295562@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I agree that it's a little unclear. In general, I think if we're going
> to blow up and die, doing it closer to the place where the problem is
> happening is for the best. On the other hand, if in most practical
> cases we're going to stumble through and get the right answer anyway,
> then it's maybe not great to break a bunch of accidentally-working
> cases. However, it does strike me that this principle could easily be
> overdone. init_toast_snapshot() could pick a random snapshot (or take
> a new one) in order to call InitToastSnapshot() and that would often
> work fine. Yet, upon realizing that things are busted, it chooses to
> error out instead. I approve of that choice, and don't think we should
> rule out the idea of making that check more robust.

I'm all for improving robustness, but failing in cases that would have
worked before (even if only accidentally) is not going to be seen by
users as more robust. I think that this late stage of the development
cycle is not the time to be putting in changes that are not actually
going to fix bugs but only call greater attention to the possibility
that a bug exists.

TBH, given where we are in the dev cycle, I thought there was a lot of
sense behind your earlier thought that HaveRegisteredOrActiveSnapshot
should be reverted entirely. I'm okay with keeping it as an assertion-
only check, so that it won't bother end users. I'm not okay with
adding end-user-visible failures, at least not till early in v16.

regards, tom lane
