From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
Cc: | Tobias Wendorff <tobias(dot)wendorff(at)tu-dortmund(dot)de>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BRIN index creation on geometry column causes crash |
Date: | 2025-02-12 18:12:20 |
Message-ID: | b32735cc-317f-4da6-b76b-e30398f4e94e@vondra.me |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On 2/12/25 11:55, Álvaro Herrera wrote:
> On 2025-Jan-03, Tomas Vondra wrote:
>
>> Reproduced, but I belive this is actually a long-standing PostGIS bug.
>> The exact place where it crashes is here:
>>
>> /* Finally, merge B to A. */
>> finfo = inclusion_get_procinfo(bdesc, attno, PROCNUM_MERGE);
>> Assert(finfo != NULL);
>> result = FunctionCall2Coll(finfo, colloid, ...);
>>
>> With asserts, it fails on the assert, i.e. finfo is NULL. This means the
>> opfamily is missing the "MERGE" support procedure, which is however
>> required (from the very beginning of BRIN in 9.5).
>
> Hmm, I completely agree that this is a PostGIS bug, but at the same time
> I think it's wrong for Postgres not to verify that the support function
> exists, so that we can throw an error instead of crashing. I think the
> most expedient way to implement this is to add a "missing_ok" argument
> to the various "foo_get_procinfo" functions, and make to pass it true
> only in cases where the usage of the return value already admit that it
> could be a null pointer (meaning it's really an optional support proc).
>
> This should avoid further crashes with invalidly defined opclasses; see
> attached POC patch. The user is still in a bad place, because for
> instance if it's autovacuum that tries to do the range merging, then
> vacuuming will fail but people may not notice for months unless they're
> paying careful attention to the server logs. Crashing is good (tm)
> because now a bunch of people are aware that the opclass is broken.
>
Yeah, I'm not opposed to doing this. I don't know what's the right level
of paranoia when dealing with user-defined stuff.
But if I get this right, this is merely a mitigation turning a crash
into ERROR, it would not help with detecting the opclass brokenness any
earlier. Until the parallel builds it was very rare to need the MERGE.
> My next question is why didn't brinvalidate detect this problem. Is it
> just that nobody ran 'amvalidate' on the opclass, or is it that
> brinvalidate does not know that it needs to require the merge proc? But
> I'll leave that for later.
>
I don't know. Either no one tried running it, or maybe it wasn't quite
clear which of the reported issues are serious and which can be ignored.
I certainly wasn't quite clear to me when I tried running it, because
all the messages are INFO, it doesn't say which functions are missing,
and at least some of it seemed to be due to cross-type operators.
See this for details:
https://lists.osgeo.org/pipermail/postgis-devel/2025-January/030457.html
>
> Now, in writing this patch I noticed that both brin_bloom.c and
> brin_minmax_multi.c seem to have cargo-culted the idea that support
> procs can be optional; both of them have a single support proc
> (PROCNUM_HASH and PROCNUM_DISTANCE, respectively) which is not optional;
> so the "extra_proc_missing" stuff seems unnecessary for them. And
> therefore the new missing_ok support I introduce in this patch would
> also be unnecessary. As far as I can tell, there are no ABI concerns
> about removing opaque->extra_proc_missing[] from both these opclass
> scaffolds (and reintroducing it later, if we determine that we do need
> optional support procs in those scaffolds after all.)
>
Yeah, copy-paste is my basic coding tool, so it's quite possible the
bloom and minmax_multi could do stuff in a simpler way.
I can't think of a reason why would removing extra_proc_missing[] break
ABI, but I don't quite see the need to change this in backbranches. And
if we want to stop doing unnecessary stuff, maybe it'd be enough to just
remove the code, but leave extra_proc_missing in the struct.
regards
--
Tomas Vondra
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2025-02-13 02:39:08 | Re: BUG #18806: When enable_rartitionwise_join is set to ON, the database shuts down abnormally |
Previous Message | Tom Lane | 2025-02-12 17:05:13 | Re: BUG #18807: libcrypto.so.1.1: cannot open shared object file |