Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403)
Date: 2022-04-18 17:20:54
Message-ID: 2359643.1650302454@sss.pgh.pa.us
Lists: pgsql-hackers

Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> I wasn't really taking a position either way about timing. If we can
> demonstrate that things other than HaveRegisteredOrActiveSnapshot()
> itself are misbehaving, then I think fixes for those bugs are
> potentially back-patchable no matter where we are in the release
> cycle,

Sure, but ...

> but in terms of when we make changes to try to detect bugs we
> don't know about yet, I could go either way on whether to do that now
> or wait. We can't know whether the bugs we haven't found yet will
> cause a big problem for someone tomorrow, ten years from now, or
> never.

... I think in this case we do have a pretty good idea of the possible
consequences. Most of the time, an unsafe toast fetch will work
fine because the toast data is still there. If you're very unlucky
then it's been deleted, and vacuumed away, and then you get a "missing
chunk number" error. If you're really astronomically unlucky, perhaps
the toast OID has been recycled and you get the wrong data (it's not
clear to me whether the toast snapshot visibility rules would prevent
this). I doubt we need to factor that last scenario into practical risk
estimates, though. So adding a non-assert check for snapshot misuse
would effectively convert "if you're very unlucky you get a weird error"
to "lucky or not, you get some other weird error", which no user is
going to see as an improvement.

> I am not really very happy about HaveRegisteredOrActiveSnapshot(),
> honestly.

Me either. If we find any other cases where it gives a false positive,
I'll be for removing it rather than fixing it. But for the moment
I'm content to leave it until we have a well-engineered solution to
the real problem.

regards, tom lane
