From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | Daniel Gustafsson <daniel(at)yesql(dot)se>, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: partition routing layering in nodeModifyTable.c |
Date: | 2020-10-13 04:32:29 |
Message-ID: | CA+HiwqENbJMzoD1VAoNWFLcnam0zJS34Vy0o5AuhW-UVq1LV6A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, Oct 13, 2020 at 1:57 AM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> On 12/10/2020 16:47, Amit Langote wrote:
> > On Mon, Oct 12, 2020 at 8:12 PM Heikki Linnakangas <hlinnaka(at)iki(dot)fi> wrote:
> >> 1. We have many different cleanup/close routines now:
> >> ExecCloseResultRelations, ExecCloseRangeTableRelations, and
> >> ExecCleanUpTriggerState. Do we need them all? It seems to me that we
> >> could merge ExecCloseRangeTableRelations() and
> >> ExecCleanUpTriggerState(), they seem to do roughly the same thing: close
> >> relations that were opened for ResultRelInfos. They are always called
> >> together, except in afterTriggerInvokeEvents(). And in
> >> afterTriggerInvokeEvents() too, there would be no harm in doing both,
> >> even though we know there aren't any entries in the es_result_relations
> >> array at that point.
> >
> > Hmm, I find trigger result relations to behave differently enough to
> > deserve a separate function. For example, unlike plan-specified
> > result relations, they don't point to range table relations and don't
> > open indices. Maybe the name could be revisited, say,
> > ExecCloseTriggerResultRelations().
>
> Matter of perception I guess. I still prefer to club them together into
> one Close call. It's true that they're slightly different, but they're
> also pretty similar. And IMHO they're more similar than different.
Okay, fine with me.
> > Also, maybe call the other functions:
> >
> > ExecInitPlanResultRelationsArray()
> > ExecInitPlanResultRelation()
> > ExecClosePlanResultRelations()
> >
> > Thoughts?
>
> Hmm. How about initializing the array lazily, on the first
> ExecInitPlanResultRelation() call? It's not performance critical, and
> that way there's one fewer initialization function that you need to
> remember to call.
Agree that's better.
> It occurred to me that if we do that (initialize the array lazily),
> there's very little need for the PlannedStmt->resultRelations list
> anymore. It's only used in ExecRelationIsTargetRelation(), but if we
> assume that ExecRelationIsTargetRelation() is only called after InitPlan
> has initialized the result relation for the relation, it can easily
> check es_result_relations instead. I think that's a safe assumption.
> ExecRelationIsTargetRelation() is only used in FDWs, and I believe the
> FDWs initialization routine can only be called after ExecInitModifyTable
> has been called on the relation.
>
> The PlannedStmt->rootResultRelations field is even more useless.
I am very much tempted to remove those fields from PlannedStmt,
although I am concerned that the following now assumes that *all*
result relations are initialized in the executor initialization phase:
bool
ExecRelationIsTargetRelation(EState *estate, Index scanrelid)
{
if (!estate->es_result_relations)
return false;
return estate->es_result_relations[scanrelid - 1] != NULL;
}
In the other thread [1], I am proposing that we initialize result
relations lazily, but the above will be a blocker to that.
> > Actually, maybe we don't need to be so paranoid about setting up
> > es_result_relations in worker.c, because none of the downstream
> > functionality invoked seems to rely on it, that is, no need to call
> > ExecInitResultRelationsArray() and ExecInitResultRelation().
> > ExecSimpleRelation* and downstream functionality assume a
> > single-relation operation and the ResultRelInfo is explicitly passed.
>
> Hmm, yeah, I like that. Similarly in ExecuteTruncateGuts(), there isn't
> actually any need to put the ResultRelInfos in the es_result_relations
> array.
>
> Putting all this together, I ended up with the attached. It doesn't
> include the subsequent commits in this patch set yet, for removal of
> es_result_relation_info et al.
Thanks.
+ * We put the ResultRelInfos in the es_opened_result_relations list, even
+ * though we don't have a range table and don't populate the
+ * es_result_relations array. That's a big bogus, but it's enough to make
+ * ExecGetTriggerResultRel() find them.
*/
estate = CreateExecutorState();
resultRelInfos = (ResultRelInfo *)
palloc(list_length(rels) * sizeof(ResultRelInfo));
resultRelInfo = resultRelInfos;
+ estate->es_result_relations = (ResultRelInfo **)
+ palloc(list_length(rels) * sizeof(ResultRelInfo *));
Maybe don't allocate es_result_relations here?
+/*
+ * Close all relations opened by ExecGetRangeTableRelation()
+ */
+void
+ExecCloseRangeTableRelations(EState *estate)
+{
+ int i;
+
+ for (i = 0; i < estate->es_range_table_size; i++)
{
if (estate->es_relations[i])
table_close(estate->es_relations[i], NoLock);
}
I think we have an optimization opportunity here (maybe as a separate
patch). Why don't we introduce es_opened_relations? That way, if
only a single or few of potentially 1000s relations in the range table
is/are opened, we don't needlessly loop over *all* relations here.
That can happen, for example, with a query where no partitions could
be pruned at planning time, so the range table contains all
partitions, but only one or few are accessed during execution and the
rest run-time pruned. Although, in the workloads where it would
matter, other overheads easily mask the overhead of this loop; see the
first message at the linked thread [1], so it is hard to show an
immediate benefit from this.
Anyway, other than my concern about ExecRelationIsTargetRelation()
mentioned above, I think the patch looks good.
--
Amit Langote
EDB: http://www.enterprisedb.com
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2020-10-13 04:51:18 | Re: Resetting spilled txn statistics in pg_stat_replication |
Previous Message | Amit Kapila | 2020-10-13 04:24:17 | Re: Resetting spilled txn statistics in pg_stat_replication |