Quick Links

Re: POC, WIP: OR-clause support for indexes

From:	Andrei Lepikhov <lepihov(at)gmail(dot)com>
To:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc:	Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Nikolay Shaplov <dhyan(at)nataraj(dot)su>, pgsql-hackers(at)lists(dot)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org, Peter Geoghegan <pg(at)bowt(dot)ie>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Peter Eisentraut <peter(at)eisentraut(dot)org>
Subject:	Re: POC, WIP: OR-clause support for indexes
Date:	2025-01-27 08:52:16
Message-ID:	88844a88-57dd-4722-b697-15c10f1b81a7@gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 1/25/25 12:04, Alexander Korotkov wrote:
> On Wed, Jan 15, 2025 at 10:24 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
>> causes SEGFAULT during index keys evaluation. I haven't dived into it
>> yet, but it seems quite a typical misstep and is not difficult to fix.
>
> Segfault appears to be caused by a typo. Patch used parent rinfo
> instead of child rinfo. Fixed in the attached patch.
Great!
>
> It appears that your first query also changed a plan after fixing
> this. Could you, please, provide another example of a regression for
> short-circuit optimization, which is related to this patch?
Yes, it may be caused by the current lazy InitPlan evaluation strategy,
which would only happen if it was really needed.

Examples:
---------

EXPLAIN (ANALYZE, COSTS OFF, BUFFERS OFF, TIMING OFF)
SELECT * FROM bitmap_split_or t1
WHERE t1.a=2 AND (t1.b=2 OR t1.b = (
SELECT avg(x) FROM generate_series(1,1e6) AS x)::integer);

without optimisation:

Index Scan using t_a_b_idx on bitmap_split_or t1 (actual rows=1 loops=1)
Index Cond: (a = 2)
Filter: ((b = 2) OR (b = ((InitPlan 1).col1)::integer))
InitPlan 1
-> Aggregate (never executed)
-> Function Scan on generate_series x (never executed)
Planning Time: 0.564 ms
Execution Time: 0.182 ms

But having it as a part of an array, we forcedly evaluate it for (not
100% sure) more precise selectivity estimation:

Index Scan using t_a_b_idx on bitmap_split_or t1
(actual rows=1 loops=1)
Index Cond: ((a = 2) AND
(b = ANY (ARRAY[2, ((InitPlan 1).col1)::integer])))
InitPlan 1
-> Aggregate (actual rows=1 loops=1)
-> Function Scan on generate_series x
(actual rows=1000000 loops=1)
Planning Time: 0.927 ms
Execution Time: 489.933 ms

This also means that if, before the patch, we executed a query
successfully, after applying the patch, we sometimes may get the error:
'ERROR: more than one row returned by a subquery used as an expression'
because of early InitPlan evaluation. See the example below:

EXPLAIN (ANALYZE, COSTS OFF)
SELECT * FROM bitmap_split_or t1
WHERE t1.a=2 AND (t1.b=2 OR t1.b = (
SELECT random() FROM generate_series(1,1e6) AS x)::integer);

Index Scan using t_a_b_idx on bitmap_split_or t1
Index Cond: ((a = 2) AND (b = ANY (ARRAY[2, ((InitPlan
1).col1)::integer])))
InitPlan 1
-> Function Scan on generate_series x

I think optimisation should have never happened and this is another
issue, isn't it?

--
regards, Andrei Lepikhov

In response to

Re: POC, WIP: OR-clause support for indexes at 2025-01-25 05:04:57 from Alexander Korotkov

Responses

Re: POC, WIP: OR-clause support for indexes at 2025-01-27 09:50:09 from Alexander Korotkov

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Jelte Fennema-Nio	2025-01-27 08:54:04	Re: New process of getting changes into the commitfest app
Previous Message	jian he	2025-01-27 08:16:26	Re: Emitting JSON to file using COPY TO