Re: BUG #18831: Particular queries using gin-indexes are not interruptible, resulting is resource usage concerns.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: niek(dot)brasa(at)hitachienergy(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #18831: Particular queries using gin-indexes are not interruptible, resulting is resource usage concerns.
Date: 2025-03-05 20:50:02
Message-ID: 474911.1741207802@sss.pgh.pa.us
Lists: pgsql-bugs

PG Bug reporting form <noreply(at)postgresql(dot)org> writes:
> As part of checking why some of the queries were not interruptible (even
> though statement_timeout was set), I was able to reduce it to something
> generic, as the usage of a gin index and the ?| operator.

Thanks for the report. It looks like the issue boils down to trying
to do a GIN index search with a very large array on the RHS of the
?| operator. extractQuery extracts all the array elements as search
keys, and then we spend O(N^2) time in a very stupid de-duplication
loop in ginFillScanEntry. Also, if we get past that, there's O(N^2)
work in startScanKey.

It's not hard to remove the O(N^2) behavior in ginFillScanEntry:
the de-duplication is just an optimization AFAICS, and we can stop
doing it once there are too many keys. A more principled approach
would be to sort the keys and then just compare adjacent ones to
de-dup, but I seriously doubt it's worth the extra code that
would be needed.
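The pattern is easy to see in a standalone sketch (not the actual patch; `MAX_DEDUP_KEYS` and `add_scan_key` are hypothetical stand-ins for the real ginFillScanEntry logic): each insertion does a linear scan for duplicates, which is O(N^2) across N keys, so past a cap we simply stop de-duplicating, since duplicates only cost a little redundant work rather than wrong answers.

```c
#include <assert.h>

/* Hypothetical cap; the patch bounds de-dup effort similarly (at 1000). */
#define MAX_DEDUP_KEYS 1000

/*
 * Append key to keys[] (current length *nkeys).  While the array is
 * small, skip duplicates via a linear scan; once it grows past the
 * cap, accept duplicates rather than pay O(N) per insertion.  The
 * de-duplication is purely an optimization, so this is safe.
 */
static void
add_scan_key(int *keys, int *nkeys, int key)
{
    if (*nkeys < MAX_DEDUP_KEYS)
    {
        for (int i = 0; i < *nkeys; i++)
        {
            if (keys[i] == key)
                return;         /* already present, drop it */
        }
    }
    keys[(*nkeys)++] = key;     /* beyond the cap: keep duplicates */
}
```

The sort-then-compare-adjacent alternative mentioned above would make de-dup O(N log N) with no cap, at the cost of more code and of perturbing the key order.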

Half of the problem in startScanKey is just wasted work: it
re-initializes the entire entryRes[] array for each required-key
probe, when it actually would take less code to incrementally
update the array. However, the other half is that we are doing
up to nentries calls of the triConsistentFn, which may be doing
O(nentries) work itself per call --- and indeed is doing so in
this specific case. I don't see any solution for that that
doesn't involve changes in the API spec for GIN opclasses.
It does seem like a pretty bad way to be doing things for
operators that have simple AND or OR semantics, where we could
know the answer in constant time if we only had an API for it.
So maybe it's worth trying to do something about that, but don't
hold your breath.
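The wasted-work half can be illustrated with a toy version (this is a simplified sketch, not the GIN code: `consistent_or` stands in for an opclass triConsistentFn with OR semantics, and the probe loop stands in for startScanKey's required-entry determination). The naive variant rebuilds entryRes[] on every probe; the incremental variant initializes it once and flips only the entry under test.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Stand-in for the opclass triConsistentFn; OR semantics here. */
static bool
consistent_or(const bool *entryRes, int nentries)
{
    for (int i = 0; i < nentries; i++)
    {
        if (entryRes[i])
            return true;
    }
    return false;
}

/*
 * Naive: rebuild entryRes[] from scratch before every probe.  Each
 * memset is O(nentries), so the loop alone is O(nentries^2) before
 * the consistent function even runs.
 */
static void
probe_required_naive(bool *required, bool *entryRes, int nentries)
{
    for (int i = 0; i < nentries; i++)
    {
        memset(entryRes, true, nentries * sizeof(bool));
        entryRes[i] = false;
        required[i] = !consistent_or(entryRes, nentries);
    }
}

/*
 * Incremental: initialize once, then flip only the entry under test
 * and restore it afterwards -- O(1) bookkeeping per probe, and less
 * code than the naive version.
 */
static void
probe_required_incremental(bool *required, bool *entryRes, int nentries)
{
    memset(entryRes, true, nentries * sizeof(bool));
    for (int i = 0; i < nentries; i++)
    {
        entryRes[i] = false;
        required[i] = !consistent_or(entryRes, nentries);
        entryRes[i] = true;     /* restore for the next probe */
    }
}
```

Note that even the incremental version stays O(nentries^2) overall, because each of the nentries probes still calls the consistent function, which here (as in the ?| case) does O(nentries) work per call; that is the half that can't be fixed without an opclass API change.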

For now I propose the attached, which bounds ginFillScanEntry's
de-dup efforts at 1000 keys, fixes the silly coding in startScanKey,
and adds a CHECK_FOR_INTERRUPTS in the startScanKey loop to cover
the fact that it's still O(N^2) in the worst case. With this
I don't notice any objectionable delay in response to cancel
interrupts. Also, while it's still O(N^2) in the end,
there's a pretty nice speedup for mid-size problems:
NOTICE: Query for 10000/ 200000 took 6.957140 s
vs
NOTICE: Query for 10000/ 200000 took 649.409895 s
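The CHECK_FOR_INTERRUPTS addition follows the usual pattern for long loops: poll once per outer iteration so a pending cancel or statement_timeout is serviced promptly. A rough sketch (the real macro lives in miscadmin.h and errors out of the query via ereport(); the stub here just counts, so the example is self-contained):

```c
#include <assert.h>
#include <signal.h>

/* Stand-in for PostgreSQL's interrupt machinery: the signal handler
 * sets a flag, and CHECK_FOR_INTERRUPTS() tests it at safe points.
 * The real macro aborts the query; this stub merely records the poll. */
static volatile sig_atomic_t InterruptPending = 0;
static int interrupts_seen = 0;

#define CHECK_FOR_INTERRUPTS() \
    do { \
        if (InterruptPending) \
        { \
            interrupts_seen++; \
            InterruptPending = 0; \
        } \
    } while (0)

/*
 * A worst-case O(N^2) loop stays responsive to cancels if each outer
 * iteration polls; the check is a cheap flag test, negligible next to
 * the O(N) inner work it guards.
 */
static void
quadratic_work(int n)
{
    for (int i = 0; i < n; i++)
    {
        CHECK_FOR_INTERRUPTS();
        for (int j = 0; j < i; j++)
            ;                   /* de-dup / probe work elided */
    }
}
```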

regards, tom lane

Attachment Content-Type Size
fix-some-gin-performance-problems.patch text/x-diff 1.7 KB
