From: | shawn wang <shawn(dot)wang(dot)pg(at)gmail(dot)com> |
---|---|
To: | Wenjing <wenjing(dot)zwj(at)alibaba-inc(dot)com> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, wjzeng <wjzeng2012(at)gmail(dot)com> |
Subject: | Re: Is it worth pushing conditions to sublink/subplan? |
Date: | 2021-09-01 02:54:35 |
Message-ID: | CA+T=_GW5JX-4EFb9iujv_ULpW1HjxmKYkjjMA1iq3fGa8FL4bQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I tested it the way you said and increased the number of sub-tables.
I created a hash partition table of 1000 sub-tables.
Test according to your first SQL, the optimizer cuts the unnecessary
sub-tables well.
You can see the plan:
postgres=# explain analyze
postgres-# select temp.p1 from
postgres-# (
postgres(# select p1,p2 from test1.test1hashtable x where x.p1 = '1'
postgres(# union all
postgres(# (values('1','1'))
postgres(# ) temp,
postgres-# test1.test1hashtable y
postgres-# where y.p2 = temp.p2 and y.p1 = '1' and y.p1='1';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Nested Loop (cost=0.00..25.55 rows=1 width=32) (actual time=0.004..0.004
rows=0 loops=1)
Join Filter: (x.p2 = y.p2)
-> Seq Scan on test1hashtable826 y (cost=0.00..12.75 rows=1 width=32)
(actual time=0.002..0.002 rows=0 loops=1)
Filter: (p1 = '1'::text)
-> Append (cost=0.00..12.78 rows=2 width=64) (never executed)
-> Seq Scan on test1hashtable826 x (cost=0.00..12.75 rows=1
width=64) (never executed)
Filter: (p1 = '1'::text)
-> Result (cost=0.00..0.01 rows=1 width=64) (never executed)
Planning Time: 0.158 ms
Execution Time: 0.022 ms
(10 rows)
But when the second one runs, the planning time reaches 13.942ms.
The plan:
postgres=# explain analyze
postgres-# select
postgres-# y.p1,
postgres-# (Select x.p2 from test1.test1hashtable x where y.p1 =x.p1 and
y.p2=x.p2) as b
postgres-# from test1.test1hashtable y where p1 = '1' and p2 = '1';
QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Seq Scan on test1hashtable826 y (cost=0.00..13318.30 rows=1 width=64)
(actual time=0.004..0.047 rows=0 loops=1)
Filter: ((p1 = '1'::text) AND (p2 = '1'::text))
SubPlan 1
-> Append (cost=0.00..13305.00 rows=1000 width=32) (never executed)
-> Seq Scan on test1hashtable1 x_1 (cost=0.00..13.30 rows=1
width=32) (never executed)
Filter: ((y.p1 = p1) AND (y.p2 = p2))
-> Seq Scan on test1hashtable1000 x_1000 (cost=0.00..13.30
rows=1 width=32) (never executed)
Filter: ((y.p1 = p1) AND (y.p2 = p2))
Planning Time: 13.942 ms
Execution Time: 4.899 ms
(2006 rows)
This is a very worthwhile thing to do. In a relatively large business
system, a large number of partition tables and high concurrency are often
used. If the planning time is too long, this will greatly affect the
business.
regards,
Shawn.
Wenjing <wenjing(dot)zwj(at)alibaba-inc(dot)com> 于2021年8月17日周二 上午10:31写道:
>
>
> 2021年8月16日 17:15,Wenjing <wenjing(dot)zwj(at)alibaba-inc(dot)com> 写道:
>
> Hi Hackers,
>
> Recently, a issue has been bothering me, This is about conditional
> push-down in SQL.
> I use cases from regression testing as an example.
> I found that the conditions (B =1) can be pushed down into the
> subquery, However, it cannot be pushed down to sublink/subplan.
> If a sublink/subplan clause contains a partition table, it can be useful
> to get the conditions for pruning.
> So, is it worth pushing conditions to sublink/subplan?
> Anybody have any ideas?
>
>
> regards,
> Wenjing
>
>
> example:
> create table p (a int, b int, c int) partition by list (a);
> create table p1 partition of p for values in (1);
> create table p2 partition of p for values in (2);
> create table q (a int, b int, c int) partition by list (a);
> create table q1 partition of q for values in (1) partition by list (b);
> create table q11 partition of q1 for values in (1) partition by list (c);
> create table q111 partition of q11 for values in (1);
> create table q2 partition of q for values in (2) partition by list (b);
> create table q21 partition of q2 for values in (1);
> create table q22 partition of q2 for values in (2);
> insert into q22 values (2, 2, 3);
>
> Sorry, I messed up the structure of the table.
> It is should be:
> create table ab (a int not null, b int not null) partition by list (a);
> create table ab_a2 partition of ab for values in(2) partition by list (b);
> create table ab_a2_b1 partition of ab_a2 for values in (1);
> create table ab_a2_b2 partition of ab_a2 for values in (2);
> create table ab_a2_b3 partition of ab_a2 for values in (3);
> create table ab_a1 partition of ab for values in(1) partition by list (b);
> create table ab_a1_b1 partition of ab_a1 for values in (1);
> create table ab_a1_b2 partition of ab_a1 for values in (2);
> create table ab_a1_b3 partition of ab_a1 for values in (3);
> create table ab_a3 partition of ab for values in(3) partition by list (b);
> create table ab_a3_b1 partition of ab_a3 for values in (1);
> create table ab_a3_b2 partition of ab_a3 for values in (2);
> create table ab_a3_b3 partition of ab_a3 for values in (3);
>
>
>
>
> postgres-# explain (costs off)
> postgres-# select temp.b from
> postgres-# (
> postgres(# select a,b from ab x where x.a = 1
> postgres(# union all
> postgres(# (values(1,1))
> postgres(# ) temp,
> postgres-# ab y
> postgres-# where y.b = temp.b and y.a = 1 and y.b=1;
> QUERY PLAN
> ---------------------------------------------------
> Nested Loop
> -> Seq Scan on ab_a1_b1 y
> Filter: ((b = 1) AND (a = 1))
> -> Append
> -> Subquery Scan on "*SELECT* 1"
> -> Seq Scan on ab_a1_b1 x
> Filter: ((a = 1) AND (b = 1))
> -> Result
> (8 rows)
>
> The conditions (B =1) can be pushed down into the subquery.
>
> postgres=# explain (costs off)
> postgres-# select
> postgres-# y.a,
> postgres-# (Select x.b from ab x where y.a =x.a and y.b=x.b) as b
> postgres-# from ab y where a = 1 and b = 1;
> QUERY PLAN
> ---------------------------------------------------
> Seq Scan on ab_a1_b1 y
> Filter: ((a = 1) AND (b = 1))
> SubPlan 1
> -> Append
> -> Seq Scan on ab_a1_b1 x_1
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a1_b2 x_2
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a1_b3 x_3
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a2_b1 x_4
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a2_b2 x_5
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a2_b3 x_6
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a3_b1 x_7
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a3_b2 x_8
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a3_b3 x_9
> Filter: ((y.a = a) AND (y.b = b))
> (22 rows)
>
> The conditions (B = 1 and A = 1) cannot be pushed down to sublink/subplan
> in targetlist.
>
> postgres=# explain (costs off)
> postgres-# select y.a
> postgres-# from ab y
> postgres-# where
> postgres-# (select x.a > x.b from ab x where y.a =x.a and y.b=x.b) and
> postgres-# y.a = 1 and y.b = 1;
> QUERY PLAN
> ---------------------------------------------------
> Seq Scan on ab_a1_b1 y
> Filter: ((a = 1) AND (b = 1) AND (SubPlan 1))
> SubPlan 1
> -> Append
> -> Seq Scan on ab_a1_b1 x_1
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a1_b2 x_2
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a1_b3 x_3
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a2_b1 x_4
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a2_b2 x_5
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a2_b3 x_6
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a3_b1 x_7
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a3_b2 x_8
> Filter: ((y.a = a) AND (y.b = b))
> -> Seq Scan on ab_a3_b3 x_9
> Filter: ((y.a = a) AND (y.b = b))
> (22 rows)
>
> The conditions (B=1 and A=1) cannot be pushed down to sublink/subplan in
> where clause.
>
>
>
>
>
From | Date | Subject | |
---|---|---|---|
Next Message | Kyotaro Horiguchi | 2021-09-01 03:12:25 | Re: Possible missing segments in archiving on standby |
Previous Message | Fujii Masao | 2021-09-01 02:34:34 | Re: prevent immature WAL streaming |