Re: Proposal: Progressive explain

From: Rafael Thofehrn Castro <rafaelthca(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andrei Lepikhov <lepihov(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Proposal: Progressive explain
Date: 2025-03-29 17:51:20
Message-ID: CAG0ozMpug5icxk8Mh_uiGc3OX3zsx67RsRA9Q-o+3ZXn-7M9mQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> (2) Merge progressive_explain = explain and progressive_explain =
> analyze into a single value, progressive_explain = on. To tell which
> is intended, just check progresive_explain_interval. If it's zero,
> then there's no repeat, so we mean what you currently call
> progressive_explain = explain i.e. serialize once. Otherwise we mean
> progressive_explain = analyze.

Implemented this version. New patch has the following characteristics:

- progressive_explain is a boolean

- GUC progressive_explain_min_duration is removed

- if progresive_explain_interval = 0, update plan in memory only at query
start

- if progresive_explain_interval > 0, share the plan in memory at query
start and update every progresive_explain_interval (instrumentation is
enabled automatically)

- The plan shared in pg_stat_progress_explain at query start does not
contain any instrumentation detail ((never executed), (actual time), etc)
if progresive_explain_interval = 0, even if query is started with EXPLAIN
ANALYZE. My reasoning here is that if the plan will be shared only once we
are actually interested in seeing the plan itself and not instrumentation
progress.

Fixed other comments you shared in previous emails too:

> void
> AtEOSubXact_ProgressiveExplain(bool isCommit, int nestDepth)
> {
> if (activeQueryDesc != NULL &&
> activeQueryXactNestLevel >= nestDepth)
> {
> if (isCommit)
> elog(WARNING, "leaked progressive explain query descriptor");
> ProgressiveExplainCleanup(NULL);
> }
>}
> By including the is-commit case in there, we can catch any bugs where
> things aren't cleaned up properly before a transaction is committed.

Added the isCommit logic both to Transaction and Subtransaction commits
and aborts that will notify about leakage if cleanup was not properly
done in a commit. Changed function names back to AtEOXact and AtEOSubXact,
as opposed to AtAbort and AtSubAbort.

> We generally want to test >= nestDepth instead of == nestDepth in case
> multiple subtransaction levels abort all at once; I'm not sure it
> matters here, but even if it isn't, it's best to be consistent with
> the practice elsewhere. Having {Commit,Abort}SubTransaction pass the
> nestDepth instead of calling GetCurrentTransactionNestLevel() also has
> precedent e.g. see AtEOSubXact_HashTables.

Done.

> As a further refinement, consider initializing
> activeQueryXactNestLevel to -1 and resetting it to that value each
> time you end a progressive EXPLAIN, so that activeQueryDesc != NULL if
> and only if activeQueryXactNestLevel >= 0. Then the test above can be
> simplified to just if (activeQueryXactNestLevel >= nestDepth).

Done.

> standard_ExecutorFinish() makes its call to ProgressiveExplainFinish()
> dependent on the value of the progressive_explain GUC, but that GUC
> could be changed in mid-query via set_config() or a plpgsql function
> that calls SET, which could result in skipping the cleanup even when
> it's needed. I think you should make the call unconditional and make
> it entirely the job of ProgressiveExplainFinish() to decide whether
> any cleanup is needed.

Done.

> ProgressiveExplainFinish() calls ProgressiveExplainCleanup() in most
> cases, but sometimes just disables the timeout instead. I think this
> is weird. I think you should just always call
> ProgressiveExplainCleanup() and make sure it's robust and cleans up
> however much or little is appropriate in all cases.

Now that I removed all the execProcNode wrapping and the conditional cleanup
based on progressive_explain_min_duration (that got removed), this
part became much simpler. ProgressiveExplainFinish always calls
ProgressiveExplainCleanup.

> On the flip side, I can't really see why
> dsa_detach(queryDesc->pestate->pe_a) needs to be done while holding
> ProgressiveExplainHashLock. Why not just have
> ProgressiveExplainFinish() call ProgressiveExplainCleanup(), and then
> afterwards it can do the dsa_detach() in the caller? Then
> ProgressiveExplainCleanup() no longer needs an argument.

That is perfect. Implemented.

> ProgressiveExplainPrint() can save a level of indentation in a large
> chunk of the function by understanding that elog(ERROR) does not
> return. You don't need to wrap everything that follows in else {}.

Fixed.

I will not fix documentation for now as we are not done with
implementation changes yet. Once we agree that the code logic is
safe and sound we can discuss cosmetics (feature name, GUCs, view, etc).

Rafael.

Attachment Content-Type Size
v14-0001-Proposal-for-progressive-explains.patch application/octet-stream 83.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2025-03-29 18:09:09 Re: pg_stat_database.checksum_failures vs shared relations
Previous Message Peter Eisentraut 2025-03-29 17:24:26 Re: 64 bit numbers vs format strings