Re: generic plans and "initial" pruning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jacob Champion <jchampion(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: generic plans and "initial" pruning
Date: 2024-08-15 15:34:51
Message-ID: CA+TgmoabYD=pnccFLzbbREFsqkFgE4EZ+FdHoTOhgCqn4jP2Cw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 15, 2024 at 8:57 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> TBH, it's more of a hunch that people who are not involved in this
> development might find the new reality, whereby the execution is not
> racefree until ExecutorRun(), hard to reason about.

I'm confused by what you mean here by "racefree". A race means
multiple sessions are doing stuff at the same time and the result
depends on who does what first, but the executor stuff is all
backend-private. Heavyweight locks are not backend-private, but those
would be taken in ExectorStart(), not ExecutorRun(), IIUC.

> With the patch, CreateQueryDesc() and ExecutorStart() are moved to
> PortalStart() so that QueryDescs including the PlanState trees for all
> queries are built before any is run. Why? So that if ExecutorStart()
> fails for any query in the list, we can simply throw out the QueryDesc
> and the PlanState trees of the previous queries (NOT run them) and ask
> plancache for a new CachedPlan for the list of queries. We don't have
> a way to ask plancache.c to replan only a given query in the list.

I agree that moving this from PortalRun() to PortalStart() seems like
a bad idea, especially in view of what you write below.

> * There's no longer CCI() between queries in PortalRunMulti() because
> the snapshots in each query's QueryDesc must have been adjusted to
> reflect the correct command counter. I've checked but can't really be
> sure if the value in the snapshot is all anyone ever uses if they want
> to know the current value of the command counter.

I don't think anything stops somebody wanting to look at the current
value of the command counter. I also don't think you can remove the
CommandCounterIncrement() calls between successive queries, because
then they won't see the effects of earlier calls. So this sounds
broken to me.

Also keep in mind that one of the queries could call a function which
does something that bumps the command counter again. I'm not sure if
that creates its own hazzard separate from the lack of CCIs, or
whether it's just another part of that same issue. But you can't
assume that each query's snapshot should have a command counter value
one more than the previous query.

While this all seems bad for the partially-initialized-execution-tree
approach, I wonder if you don't have problems here with the other
design, too. Let's say you've the multi-query case and there are 2
queries. The first one (Q1) is SELECT mysterious_function() and the
second one (Q2) is SELECT * FROM range_partitioned_table WHERE
key_column = 42. What if mysterious_function() performs DDL on
range_partitioned_table? I haven't tested this so maybe there are
things going on here that prevent trouble, but it seems like executing
Q1 can easily invalidate the plan for Q2. And then it seems like
you're basically back to the same problem.

> > > 3. The need to add *back* the fields to store the RT indexes of
> > > relations that are not looked at by ExecInitNode() traversal such as
> > > root partitioned tables and non-leaf partitions.
> >
> > I don't remember exactly why we removed those or what the benefit was,
> > so I'm not sure how big of a problem it is if we have to put them
> > back.
>
> We removed those in commit 52ed730d511b after commit f2343653f5b2
> removed redundant execution-time locking of non-leaf relations. So we
> removed them because we realized that execution time locking is
> unnecessary given that AcquireExecutorLocks() exists and now we want
> to add them back because we'd like to get rid of
> AcquireExecutorLocks(). :-)

My bias is to believe that getting rid of AcquireExecutorLocks() is
probably the right thing to do, but that's not a strongly-held
position and I could be totally wrong about it. The thing is, though,
that AcquireExecutorLocks() is fundamentally stupid, and it's hard to
see how it can ever be any smarter. If we want to make smarter
decisions about what to lock, it seems reasonable to me to think that
the locking code needs to be closer to code that can evaluate
expressions and prune partitions and stuff like that.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2024-08-15 16:00:32 Re: long-standing data loss bug in initial sync of logical replication
Previous Message Alexander Lakhin 2024-08-15 15:00:00 Re: Remove dependence on integer wrapping