From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Andres Freund <andres(at)anarazel(dot)de>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: TRAP: FailedAssertion("HaveRegisteredOrActiveSnapshot()", File: "toast_internals.c", Line: 670, PID: 19403) |
Date: | 2022-04-18 16:50:31 |
Message-ID: | CA+Tgmoa1+hJ3FieSugwYjAoEpM+pUm7EyAQbzzUBDcjjkX=kRw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Apr 18, 2022 at 11:26 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> > I agree that it's a little unclear. In general, I think if we're going
> > to blow up and die, doing it closer to the place where the problem is
> > happening is for the best. On the other hand, if in most practical
> > cases we're going to stumble through and get the right answer anyway,
> > then it's maybe not great to break a bunch of accidentally-working
> > cases. However, it does strikes me that this principal could easily be
> > overdone. init_toast_snapshot() could pick a random snapshot (or take
> > a new one) in order to call InitToastSnapshot() and that would often
> > work fine. Yet, upon realizing that things are busted, it chooses to
> > error out instead. I approve of that choice, and don't think we should
> > rule out the idea of making that check more robust.
>
> I'm all for improving robustness, but failing in cases that would have
> worked before (even if only accidentally) is not going to be seen by
> users as more robust. I think that this late stage of the development
> cycle is not the time to be putting in changes that are not actually
> going to fix bugs but only call greater attention to the possibility
> that a bug exists.
>
> TBH, given where we are in the dev cycle, I thought there was a lot of
> sense behind your earlier thought that HaveRegisteredOrActiveSnapshot
> should be reverted entirely. I'm okay with keeping it as an assertion-
> only check, so that it won't bother end users. I'm not okay with
> adding end-user-visible failures, at least not till early in v16.
I wasn't really taking a position either way about timing. If we can
demonstrate that things other than HaveRegisteredOrActiveSnapshot()
itself are misbehaving, then I think fixes for those bugs are
potentially back-patchable no matter where we are in the release
cycle, but in terms of when we make changes to try to detect bugs we
don't know about yet, I could go either way on whether to do that now
or wait. We can't know whether the bugs we haven't found yet will
cause a big problem for someone tomorrow, ten years from now, or
never.
I am not really very happy about HaveRegisteredOrActiveSnapshot(),
honestly. I think in the form we have it in the tree it looks
under-engineered. It's not really testing the right thing (even
leaving aside the bug fix) as you have been pointing out, it doesn't
really mesh well with the sanity checking that was there before as I
have been pointing out, and it's only used in one place. I wouldn't be
sad if it got reverted. However, I don't think it's going to do us any
great harm, either. Although it's a long way from the best thing we
have in the tree, it's also a long way from the worst thing we have in
the tree.
--
Robert Haas
EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | John Naylor | 2022-04-18 16:55:06 | Re: subscribe hackers |
Previous Message | Noah Misch | 2022-04-18 15:52:24 | Re: Postgres perl module namespace |