Re: second CTE kills perf

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Nicolas Seinlet <nicolas(at)seinlet(dot)com>
Cc: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: second CTE kills perf
Date: 2021-06-22 11:17:49
Message-ID: CAM+6J97DzM+__cemAZE83zC6+y17EjBvk-hzVZQ4dNVLk7krVA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, 22 Jun 2021 at 13:50, Nicolas Seinlet <nicolas(at)seinlet(dot)com> wrote:

> Hello,
>
> oversimplified example:
> 10 seconds version:
> | WITH cte1 AS (SELECT x,y,z FROM table) SELECT row_number() over(),x,y,z
> FROM cte1 WHERE x=32;
>
> 10 minutes version:
> | WITH cte1 AS (SELECT x,y,z FROM table), cte2 AS (SELECT row_number()
> over(),x,y,z FROM cte1) SELECT * FROM cte2 WHERE x=32;
>
>
A simplified setup of the above issue.
I was able to reproduce this
postgres=# -- create table t(id int primary key, name text);
postgres=# -- insert into t select x, x::text from generate_series(1,
1000000) x;

--pushdown
postgres=# explain (analyze,buffers) with cte as (select * from t) select
*, row_number() over () from cte where id = 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.42..8.46 rows=1 width=18) (actual time=0.014..0.016
rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using t_pkey on t (cost=0.42..8.44 rows=1 width=10)
(actual time=0.010..0.011 rows=1 loops=1)
Index Cond: (id = 10)
Buffers: shared hit=4
Planning Time: 0.074 ms
Execution Time: 0.029 ms
(7 rows)

--no pushdown
postgres=# explain (analyze,buffers) with cte as (select * from t), cte2 as
(select *, row_number() over () from cte) select * from cte2 where id = 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Subquery Scan on cte2 (cost=0.00..40405.00 rows=1 width=18) (actual
time=0.017..224.461 rows=1 loops=1)
Filter: (cte2.id = 10)
Rows Removed by Filter: 999999
Buffers: shared hit=609 read=4796
-> WindowAgg (cost=0.00..27905.00 rows=1000000 width=18) (actual
time=0.012..185.554 rows=1000000 loops=1)
Buffers: shared hit=609 read=4796
-> Seq Scan on t (cost=0.00..15405.00 rows=1000000 width=10)
(actual time=0.007..45.168 rows=1000000 loops=1)
Buffers: shared hit=609 read=4796
Planning Time: 0.068 ms
Execution Time: 224.479 ms
(10 rows)

-- without aggregate, pushdown works even with multiple ctes
(analyze,buffers) with cte as (select * from t), cte2 as (select * from cte
where id < 100) select * from cte2 where id = 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------
Index Scan using t_pkey on t (cost=0.42..8.45 rows=1 width=10) (actual
time=0.005..0.006 rows=1 loops=1)
Index Cond: ((id < 100) AND (id = 10))
Buffers: shared hit=4
Planning:
Buffers: shared hit=4
Planning Time: 0.074 ms
Execution Time: 0.015 ms
(7 rows)

--with window aggregate, even at the top cte, predicate is not applied
explain (analyze,buffers) with cte as (select *, row_number() over () from
t), cte2 as (select * from cte where id < 100) select * from cte2 where id
= 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
Subquery Scan on cte (cost=0.00..42905.00 rows=1 width=18) (actual
time=0.013..226.454 rows=1 loops=1)
Filter: ((cte.id < 100) AND (cte.id = 10))
Rows Removed by Filter: 999999
Buffers: shared hit=673 read=4732
-> WindowAgg (cost=0.00..27905.00 rows=1000000 width=18) (actual
time=0.009..187.550 rows=1000000 loops=1)
Buffers: shared hit=673 read=4732
-> Seq Scan on t (cost=0.00..15405.00 rows=1000000 width=10)
(actual time=0.005..44.613 rows=1000000 loops=1)
Buffers: shared hit=673 read=4732
Planning Time: 0.055 ms
Execution Time: 226.468 ms

--without cte predicate is applied before window aggregate ?
postgres=# explain (analyze,buffers) select *, row_number() over () from t
where id = 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
WindowAgg (cost=0.42..8.46 rows=1 width=18) (actual time=0.018..0.020
rows=1 loops=1)
Buffers: shared hit=4
-> Index Scan using t_pkey on t (cost=0.42..8.44 rows=1 width=10)
(actual time=0.013..0.014 rows=1 loops=1)
Index Cond: (id = 10)
Buffers: shared hit=4
Planning Time: 0.053 ms
Execution Time: 0.037 ms
(7 rows)

Thank you for raising this, I hope i'll gain something here.

--
Thanks,
Vijay
Mumbai, India

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2021-06-22 13:29:28 Re: cpu-intensive immutable function and parallel scan
Previous Message Oliver Kohll 2021-06-22 09:19:57 Re: replace inside regexp_replace