From: | David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> |
---|---|
To: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [HACKERS] path toward faster partition pruning |
Date: | 2018-01-10 04:18:58 |
Message-ID: | CAKJS1f_UBRUtmX29r-3ayysGAT3PGcs-z9DQDqGhuat87dbFrg@mail.gmail.com |
Lists: | pgsql-hackers |
On 9 January 2018 at 21:40, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> Sorry about the absence in the last few days. I will post a new version
> addressing various review comments by the end of this week.
I'm sorry for the flood of emails today.
Beena's tests on the run-time partition pruning patch also indirectly
exposed a problem with this patch.
Basically, the changes to add_paths_to_append_rel() are causing
duplicate entries in partitioned_rels.
A test case is:
create table part (a int, b int) partition by list(a);
create table part1 partition of part for values in(1) partition by list (b);
create table part2 partition of part1 for values in(1);
select * from part;
partitioned_rels ends up with 3 items in the list, but there are only 2
partitioned tables (part and part1) here. The reason is that planning
calls add_paths_to_append_rel recursively: the list for part ends up
with itself and part1 in it, and since part1's own list already contains
part1, per set_append_rel_size's "rel->live_partitioned_rels =
list_make1_int(rti);", part1 ends up in the list twice.
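To make the sequence above concrete, here is a minimal standalone sketch of how the duplicate can arise. It is a simplified model, not planner code: IntList and its helpers are hypothetical stand-ins for an integer List, and the RT indexes (part = 1, part1 = 2) are assumed for illustration.

```c
#include <stddef.h>

#define MAX_RELS 16

/* Hypothetical stand-in for an integer List of range-table indexes. */
typedef struct IntList
{
    int     items[MAX_RELS];
    size_t  len;
} IntList;

/* Append one RT index; overflow is silently ignored to keep the sketch short. */
static void
intlist_append(IntList *list, int rti)
{
    if (list->len < MAX_RELS)
        list->items[list->len++] = rti;
}

/* Append every element of src to dst, in the spirit of list_concat(). */
static void
intlist_concat(IntList *dst, const IntList *src)
{
    for (size_t i = 0; i < src->len; i++)
        intlist_append(dst, src->items[i]);
}

/* Count how many times rti occurs in the list. */
static size_t
intlist_count(const IntList *list, int rti)
{
    size_t n = 0;

    for (size_t i = 0; i < list->len; i++)
        if (list->items[i] == rti)
            n++;
    return n;
}

/* Model of the reported sequence, assuming RT indexes part = 1, part1 = 2. */
static IntList
model_duplicate(void)
{
    IntList part1_live = {{0}, 0};
    IntList partitioned_rels = {{0}, 0};

    /* part1's own list starts with itself (the list_make1_int(rti) step). */
    intlist_append(&part1_live, 2);

    /* The list for part already holds itself and part1. */
    intlist_append(&partitioned_rels, 1);
    intlist_append(&partitioned_rels, 2);

    /* Concatenating the child's whole list adds part1 a second time. */
    intlist_concat(&partitioned_rels, &part1_live);

    return partitioned_rels;
}
```

In this model, model_duplicate() returns a list of three entries in which RT index 2 appears twice, matching the 3 items seen for the test case.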
It would be nicer if you could use a Relids for this, but you'd also
need some way to store the target partition relation since
nodeModifyTable.c does:
    /* The root table RT index is at the head of the partitioned_rels list */
    if (node->partitioned_rels)
    {
        Index   root_rti;
        Oid     root_oid;

        root_rti = linitial_int(node->partitioned_rels);
        root_oid = getrelid(root_rti, estate->es_range_table);
        rel = heap_open(root_oid, NoLock);  /* locked by InitPlan */
    }
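As a rough illustration of the set-based idea, the sketch below models a set of RT indexes as a 64-bit mask with the root's index stored alongside it. Everything here is hypothetical: PartitionedRelSet and the relset_* helpers are not PostgreSQL names, and the 0..63 index range is an assumption made only to keep the example small.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: a set of RT indexes (0..63) plus the root kept separately. */
typedef struct PartitionedRelSet
{
    uint64_t    members;    /* bit i set => RT index i is a member */
    int         root_rti;   /* root table's RT index; 0 if unset */
} PartitionedRelSet;

/* Add an RT index; adding the same index twice has no further effect. */
static void
relset_add(PartitionedRelSet *set, int rti)
{
    if (rti >= 0 && rti < 64)
        set->members |= UINT64_C(1) << rti;
}

/* Test membership of an RT index. */
static bool
relset_contains(const PartitionedRelSet *set, int rti)
{
    return rti >= 0 && rti < 64 &&
        (set->members & (UINT64_C(1) << rti)) != 0;
}

/* Count members by clearing the lowest set bit until none remain. */
static int
relset_count(const PartitionedRelSet *set)
{
    int n = 0;

    for (uint64_t bits = set->members; bits != 0; bits &= bits - 1)
        n++;
    return n;
}
```

Adding index 2 twice leaves the count unchanged, so the duplicate cannot occur. The set has no notion of a head element, though, which is why the root index has to be carried separately in this sketch.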
You could also fix it by replacing this:
    /*
     * Accumulate the live partitioned children of this child, if it's
     * itself partitioned rel.
     */
    if (childrel->part_scheme)
        partitioned_rels = list_concat(partitioned_rels,
                                       childrel->live_partitioned_rels);
with something along the lines of:
    if (childrel->part_scheme)
    {
        ListCell   *lc;
        ListCell   *start = lnext(list_head(childrel->live_partitioned_rels));

        for_each_cell(lc, start)
            partitioned_rels = lappend_int(partitioned_rels,
                                           lfirst_int(lc));
    }
Although it seems pretty fragile. It would probably be better to find
a nicer way of handling all this.
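For reference, the skip-the-head idea can be modelled outside the planner with a small standalone sketch. RtiList and its helpers are hypothetical stand-ins for an integer List, and the RT indexes used with them are assumed values, not taken from a real plan.

```c
#include <stddef.h>

#define RTI_LIST_MAX 16

/* Hypothetical stand-in for an integer List of range-table indexes. */
typedef struct RtiList
{
    int     items[RTI_LIST_MAX];
    size_t  len;
} RtiList;

/* Append one RT index; overflow is silently ignored to keep the sketch short. */
static void
rtilist_append(RtiList *list, int rti)
{
    if (list->len < RTI_LIST_MAX)
        list->items[list->len++] = rti;
}

/*
 * Append every entry of child except the first one. This assumes the first
 * entry is the child itself and is already present in dst. An empty child
 * list is tolerated and appends nothing.
 */
static void
rtilist_append_tail(RtiList *dst, const RtiList *child)
{
    for (size_t i = 1; i < child->len; i++)
        rtilist_append(dst, child->items[i]);
}
```

With a parent list of {1, 2} and a child list of {2}, rtilist_append_tail() leaves the parent at two entries. The result is only correct while the child really is the first entry of its own list and already appears in the parent's list, which is probably where the fragility comes from.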
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services