From: | Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru> |
---|---|
To: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Cc: | Andrei Lepikhov <lepihov(at)gmail(dot)com>, Ivan Kush <ivan(dot)kush(at)tantorlabs(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
Subject: | Re: Replace IN VALUES with ANY in WHERE clauses during optimization |
Date: | 2025-04-03 14:18:46 |
Message-ID: | 6865d6c1-8bb0-4528-af96-f62220b67f27@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 02.04.2025 22:00, Alexander Korotkov wrote:
> Hi, Alena!
>
> On Wed, Apr 2, 2025 at 5:33 PM Alena Rybakina
> <a(dot)rybakina(at)postgrespro(dot)ru> wrote:
>
> I prepared a patch according to my suggestions, it just checks
> that the transformation is not carried out if there is a var
> element, there are changes only in one test, but I think it is
> correct.
>
> diff -U3
> /home/alena/postgrespro_or3/src/test/regress/expected/subselect.out
> /home/alena/postgrespro_or3/src/test/regress/results/subselect.out
> ---
> /home/alena/postgrespro_or3/src/test/regress/expected/subselect.out
> 2025-04-02 02:50:07.018329864 +0300
> +++
> /home/alena/postgrespro_or3/src/test/regress/results/subselect.out
> 2025-04-02 17:27:09.845104001 +0300
> @@ -3027,18 +3027,15 @@
> SELECT ten FROM onek t WHERE unique1 IN (VALUES (0), ((2 IN
> (SELECT (3)))::integer)
> );
> - QUERY PLAN
> -----------------------------------------------------
> - Nested Loop
> - -> Unique
> - -> Sort
> - Sort Key: "*VALUES*".column1
> - -> Values Scan on "*VALUES*"
> - SubPlan 1
> - -> Result
> - -> Index Scan using onek_unique1 on onek t
> - Index Cond: (unique1 = "*VALUES*".column1)
> -(9 rows)
> + QUERY PLAN
> +------------------------------------------------------------------------------------------------
> + Bitmap Heap Scan on onek t
> + Recheck Cond: (unique1 = ANY (ARRAY[0, ((ANY (2 = (hashed
> SubPlan 1).col1)))::integer]))
> + -> Bitmap Index Scan on onek_unique1
> + Index Cond: (unique1 = ANY (ARRAY[0, ((ANY (2 = (hashed
> SubPlan 1).col1)))::integer]))
> + SubPlan 1
> + -> Result
> +(6 rows)
>
> -- Alow to transformation and hold conversion between types of
> colemns and
> -- declared type of column pointed in RTE
>
> Your patch turns back performance degradation issue that I mention in [1].
>
> With your patch.
> # explain analyze select * from test where val in (VALUES (5000),
> (4000), (9000), (2000), (1000), (140050), ((select 1)));
> QUERY PLAN
> -------------------------------------------------------------------------------------------------------
> Seq Scan on test (cost=0.01..23175.01 rows=7 width=4) (actual
> time=0.455..417.929 rows=7.00 loops=1)
> Filter: (val = ANY (ARRAY[5000, 4000, 9000, 2000, 1000, 140050,
> (InitPlan 1).col1]))
> Rows Removed by Filter: 999993
> Buffers: shared hit=2362 read=2063
> InitPlan 1
> -> Result (cost=0.00..0.01 rows=1 width=4) (actual
> time=0.002..0.002 rows=1.00 loops=1)
> Planning Time: 0.211 ms
> Execution Time: 417.984 ms
> (8 rows)
>
> Without your patch.
> # explain analyze select * from test where val in (VALUES (5000),
> (4000), (9000), (2000), (1000), (140050), ((select 1)));
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------------------------
> Hash Semi Join (cost=0.19..17050.26 rows=7 width=4) (actual
> time=0.289..210.335 rows=7.00 loops=1)
> Hash Cond: (test.val = "*VALUES*".column1)
> Buffers: shared hit=2211 read=2214
> InitPlan 1
> -> Result (cost=0.00..0.01 rows=1 width=4) (actual
> time=0.002..0.002 rows=1.00 loops=1)
> -> Seq Scan on test (cost=0.00..14425.00 rows=1000000 width=4)
> (actual time=0.203..85.925 rows=1000000.00 loops=1)
> Buffers: shared hit=2211 read=2214
> -> Hash (cost=0.09..0.09 rows=7 width=4) (actual
> time=0.067..0.068 rows=7.00 loops=1)
> Buckets: 1024 Batches: 1 Memory Usage: 9kB
> -> Values Scan on "*VALUES*" (cost=0.00..0.09 rows=7
> width=4) (actual time=0.004..0.023 rows=7.00 loops=1)
> Planning Time: 0.321 ms
> Execution Time: 210.409 ms
> (12 rows)
>
> Only when we have all consts in the array, we can have SAOP hashing
> which is competitive with hashing of VALUES node. So, I'm continuing
> with my version of patch.
>
> Links.
> 1.
> https://www.postgresql.org/message-id/CAPpHfds-7eJ3ZMgyeVrMfC5E1nTHD4Bp0ch5MZhrYSoiCfERXw%40mail.gmail.com
>
>
Okay, I agree with you.
--
Regards,
Alena Rybakina
Postgres Professional
From | Date | Subject | |
---|---|---|---|
Next Message | Aleksander Alekseev | 2025-04-03 14:25:58 | Re: [PATCH] Refactor SLRU to always use long file names |
Previous Message | Robert Haas | 2025-04-03 14:10:26 | Re: RFC: Logging plan of the running query |