Re: [HACKERS] Runtime Partition Pruning

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Jesper Pedersen <jesper(dot)pedersen(at)redhat(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, Beena Emerson <memissemerson(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>, amul sul <sulamul(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Subject: Re: [HACKERS] Runtime Partition Pruning
Date: 2018-04-05 13:41:30
Message-ID: CAKJS1f8KDowMcMdff9+gr5pTzsrVucA+xeCH5ck05bAzCTXNMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Amit,

Thanks for having a look at this.

On 6 April 2018 at 00:54, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> I looked at it and here are my thoughts on it after having for the first
> time looked very closely at it.
>
> * Regarding PartitionPruneInfo:
>
> I think the names of the fields could be improved -- like pruning_steps
> instead prunesteps, unpruned_parts instead of allpartindexs. The latter
> is even a bit misleading because it doesn't in fact contain *all*
> partition indexes, only those that are unpruned, because each either has a
> subpath or it appears in (unpruned) partitioned_rels list. Also, I didn't
> like the name subnodeindex and subpartindex very much. subplan_indexes
> and parent_indexes would sound more informative to me.

Seems mostly fair. I'm not a fan of using the term "unpruned" though.
I'll have a think. The "all" is meant in terms of what exists as
subnodes.

subplan_indexes and parent_indexes seem like better names, I agree.

> * make_partition_pruneinfo has a parameter resultRelations that's not used
> anywhere

It gets used in 0005.

I guess I could only add it in 0005, but make_partition_pruneinfo is
only used in 0003, so you could say the same about that entire
function.

Do you think I should delay adding that parameter until the 0005 patch?

> * In make_partition_pruneinfo()
>
> allsubnodeindex = palloc(sizeof(int) * root->simple_rel_array_size);
> allsubpartindex = palloc(sizeof(int) * root->simple_rel_array_size);
>
> I think these arrays need to have root->simple_rel_array_size + 1
> elements, because they're subscripted using RT indexes which start at 1.

RT indexes are always 1-based. See setup_simple_rel_arrays. It already
sets the array size to list_length(rtable) + 1.

> * Also in make_partition_pruneinfo()
>
> /* Initialize to -1 to indicate the rel was not found */
> for (i = 0; i < root->simple_rel_array_size; i++)
> {
> allsubnodeindex[i] = -1;
> allsubpartindex[i] = -1;
> }
>
> Maybe, allocate the arrays above mentioned using palloc0 and don't do this
> initialization. Instead make the indexes that are stored in these start
> with 1 and consider 0 as invalid entries.

0 is a valid subplan index. It is possible to make this happen, but
I'd need to subtract 1 everywhere I used the map. That does not seem
very nice. Seems more likely to result in bugs where we might forget
to do the - 1.

Did you want this because you'd rather have the palloc0() than the for
loop setting the array elements to -1? Or is there another reason?

> * Regarding the code added in execPartition.c and execPartition.h:
>
> I wondered why PartitionedRelPruning is named the way it is. I saw many
> parallels with PartitionDispatchData both in terms of the main thing it
> consists of, that is, the map that translates partition indexes as in
> partition descriptor to that of subplans or of some other executor
> structure. Also, I imagine you tried to mimic PartitionTupleRouting with
> PartitionPruning but not throughout. For example, tuple routing struct
> pointer variables are throughout called proute, whereas PartitionPruning
> ones are called partprune instead of, say, pprune. Consistency would
> help, imho.

Yes, I saw similarities and did end up moving all the code into
execPartition a while back.

I'll look into this renaming.

> * Instead of nesting PartitionedRelPruning inside another, just store them
> in a global flat array in the PartitionPruning, like PartitionTupleRouting
> does for PartitionDispatch of individual partitioned tables in the tree.
>
> typedef struct PartitionedRelPruning
> {
> int nparts;
> int *subnodeindex;
> struct PartitionedRelPruning **subpartprune;

There is a flat array in PartitionPruning. subpartprune contains
pointers into that array. I want to have this pointer array so I can
directly reference the flat array in PartitionPruning.

Maybe I've misunderstood what you mean here.

> * I don't see why there needs to be nparts in the above, because it
> already has a PartitionPruneContext member which has that information.

Good point. I'll remove that.

> In fact, I made most of changes myself while going through the code.
> Please see attached the delta patch. It also tweaks quite a few other
> things including various comments. I think parts of it apply to 0001,
> 0003, and 0004 patches. See if this looks good to you.

Thanks. I'll look.

It's late over this side now, so will look tomorrow.

Thanks again for reviewing this.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2018-04-05 13:42:37 Re: Excessive PostmasterIsAlive calls slow down WAL redo
Previous Message Dent John 2018-04-05 13:41:15 Query Rewrite for Materialized Views (FDW Extension)