From: | Mathias Kunter <mathiaskunter(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: BUG #17964: Missed query planner optimization |
Date: | 2023-06-07 11:47:59 |
Message-ID: | 94608857-707f-55ce-99a0-ac500aeb06f8@gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
> It's not a bug that the planner does not consider evaluating the join
> before the UNION
Yes, it's not a bug, but it's something which can be improved. If I
simply change the original query from this:
> SELECT ... WHERE id IN (SELECT ...);
into this:
> SELECT ... WHERE id = ANY(ARRAY(SELECT ...));
then Postgres uses an index scan, and the query is orders of magnitude
faster. Note that the planner actually correctly computes the estimated
costs for both variants, since I get:
> cost=769.11..1227.17 when using IN
> cost=86.45..86.65 when using ANY
See https://dbfiddle.uk/iOkiiTJJ
Also note that this issue doesn't only affect UNION queries. For
example, the following query will also execute orders of magnitude
faster if I simply replace IN with ANY:
> SELECT * FROM t WHERE x = 'a' OR y IN (SELECT ...);
Again, estimated costs say that using ANY should be faster:
> cost=8.30..2443.31 when using IN
> cost=56.45..350.69 when using ANY
See https://dbfiddle.uk/b9piwQr4
Hence, why doesn't the planner simply test whether it's beneficial to
replace IN with ANY? It seems that all which has to be done is to
compare the query plans for both possible execution variants. I guess
this should be rather simple to implement, isn't it?
Thanks
Mathias
Am 06.06.23 um 23:32 schrieb David Rowley:
> On Wed, 7 Jun 2023 at 04:44, PG Bug reporting form
> <noreply(at)postgresql(dot)org> wrote:
>> In the example below, the query planner uses a sequential scan (query 1)
>> even though it could use an index scan (query 2).
>>
>> EXPLAIN ANALYZE SELECT id, name FROM (SELECT id, name FROM table1 UNION
>> SELECT id, name FROM table2) AS q
>> WHERE id IN (SELECT id FROM table3);
>
>> EXPLAIN ANALYZE SELECT id, name FROM (SELECT id, name FROM table1 UNION
>> SELECT id, name FROM table2) AS q
>> WHERE id IN (1538,8836,5486,3464,2673);
>
> It's not a bug that the plannSer does not consider evaluating the join
> before the UNION, it's just an optimisation opportunity we don't
> currently explore.
>
> If you want that, then write:
>
> EXPLAIN ANALYZE SELECT id, name FROM table1 WHERE id IN (SELECT id
> FROM table3) UNION SELECT id, name FROM table2 WHERE id IN (SELECT id
> FROM table3);
>
> David
From | Date | Subject | |
---|---|---|---|
Next Message | David Rowley | 2023-06-07 12:36:03 | Re: BUG #17964: Missed query planner optimization |
Previous Message | PG Bug reporting form | 2023-06-07 09:57:49 | BUG #17967: [msvc-x86] Failed to build with error LNK2019: unresolved external symbol __BitScanReverse64 referen |