| From: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> | 
|---|---|
| To: | jesper(dot)pedersen(at)redhat(dot)com | 
| Cc: | Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: [HACKERS] path toward faster partition pruning | 
| Date: | 2017-11-24 05:00:22 | 
| Message-ID: | 60280612-9319-fbb4-e71d-d023b5a7ab61@lab.ntt.co.jp | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
Thanks Jesper.
On 2017/11/23 3:56, Jesper Pedersen wrote:
> Hi Amit,
> 
> On 11/22/2017 03:59 AM, Amit Langote wrote:
>> Fixed in the attached.  No other changes beside that.
>>
> 
> I have been using the following script to look at the patch
> 
> -- test.sql --
[ ... ]
> EXPLAIN (ANALYZE) SELECT t1.a, t1.b, t2.c, t2.d FROM t1 INNER JOIN t2 ON
> t2.c = t1.b WHERE t2.d = 1;
> 
> I just wanted to highlight that the "JOIN ON" partition isn't pruned - the
> "WHERE" one is.
Did you mean to write ON t2.d = t1.b?  If so, equivalence class mechanism
will give rise to a t1.b = 1 and hence help prune t1's partition as well:
EXPLAIN (COSTS OFF)
SELECT t1.a, t1.b, t2.c, t2.d
FROM t1 INNER JOIN t2 ON t2.d = t1.b
WHERE t2.d = 1;
                        QUERY PLAN
-----------------------------------------------------------
 Nested Loop
   ->  Append
         ->  Bitmap Heap Scan on t1_p00
               Recheck Cond: (b = 1)
               ->  Bitmap Index Scan on idx_t1_b_a_p00
                     Index Cond: (b = 1)
   ->  Materialize
         ->  Append
               ->  Bitmap Heap Scan on t2_p00
                     Recheck Cond: (d = 1)
                     ->  Bitmap Index Scan on idx_t2_d_p00
                           Index Cond: (d = 1)
In your original query, you use ON t2.c = t1.b, whereby there is no
"constant" value to perform partition pruning with.  t2.c is unknown until
the join actually executes.
> BEGIN;
> EXPLAIN (ANALYZE) UPDATE t1 SET a = 1 WHERE b = 1;
> ROLLBACK;
>
> BEGIN;
> EXPLAIN (ANALYZE) DELETE FROM t1 WHERE b = 1;
> ROLLBACK;
> 
> Should pruning of partitions for UPDATEs (where the partition key isn't
> updated) and DELETEs be added to the TODO list?
Note that partition pruning *does* work for UPDATE and DELETE, but only if
you use list/range partitioning.  The reason it doesn't work in this case
(t1 is hash partitioned) is that the pruning is still based on constraint
exclusion in the UPDATE/DELETE case and constraint exclusion cannot handle
hash partitioning.
See example below that uses list partitioning:
drop table t1, t2;
create table t1 (a int, b int) partition by list (b);
create table t1_p0 partition of t1 for values in (0);
create table t1_p1 partition of t1 for values in (1);
create table t1_p2 partition of t1 for values in (2);
create table t1_p3 partition of t1 for values in (3);
create table t2 (c int, d int) partition by list (d);
create table t2_p0 partition of t2 for values in (0);
create table t2_p1 partition of t2 for values in (1);
create table t2_p2 partition of t2 for values in (2);
create table t2_p3 partition of t2 for values in (3);
explain (costs off) update t1 set a = 1 where b = 1;
       QUERY PLAN
=------------------------
 Update on t1
   Update on t1_p1
   ->  Seq Scan on t1_p1
         Filter: (b = 1)
(4 rows)
explain (costs off) delete from t1 where b = 1;
       QUERY PLAN
=------------------------
 Delete on t1
   Delete on t1_p1
   ->  Seq Scan on t1_p1
         Filter: (b = 1)
(4 rows)
I can see how that seems a bit odd.  If you use hash partitioning,
UPDATE/DELETE do not benefit from partition-pruning, even though SELECT
does.  That's because SELECT uses the new partition-pruning method (this
patch set) which supports hash partitioning, whereas UPDATE and DELETE use
constraint exclusion which doesn't.  It would be a good idea to make even
UPDATE and DELETE use the new method thus bringing everyone on the same
page, but that requires us to make some pretty non-trivial changes to how
UPDATE/DELETE planning works for inheritance/partitioned tables, which we
should undertake separately, imho.
Thanks,
Amit
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Amit Langote | 2017-11-24 05:22:18 | Re: [HACKERS] UPDATE of partition key | 
| Previous Message | Ashutosh Bapat | 2017-11-24 04:57:26 | Re: [HACKERS] Partition-wise aggregation/grouping |