Re: RFC: Logging plan of the running query

From: Atsushi Torikoshi <torikoshia(dot)tech(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, samimseih(at)gmail(dot)com
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, destrex271(at)gmail(dot)com
Subject: Re: RFC: Logging plan of the running query
Date: 2025-04-05 06:13:47
Message-ID: CAM6-o=BecZ-FOOHZXLhMU=JGpGQBPS5cA=ASoLUr92fFgjvQqg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Apr 3, 2025 at 11:10 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> Looking at ExplainAssembleLogOutput() is making me realize that
> auto_explain is in serious need of some cleanup. That's not really the
> fault of this patch, but the hack whereby we overwrite the [] that
> would have surrounded the JSON output with {} is not very nice. I also
> think that the auto_explain GUCs need rethinking. In theory, new
> EXPLAIN options should be mirrored into auto_explain, but if you
> compare ExplainAssembleLogOutput() to ExplainOnePlan(), you can see
> that they are diverging. The PLANNING, SUMMARY, and SERIALIZE options
> that are known to regular EXPLAIN aren't known to auto_explain, and
> any customizable options that use explain_per_plan_hook won't be able
> to work with auto_explain, either. Changing this is non-trivial
> because SERIALIZE, for example, can't work the same way for
> auto_explain as it does for EXPLAIN, and a full solution might also
> require user-visible changes like replacing
> auto_explain.log_<whatever> with auto_explain.explain, so I don't
> really know. Maybe we should just live with it the way it is for now,
> but it doesn't look very nice.

I agree that the current state isn’t ideal, and that addressing it wouldn’t
be trivial.
While I do think it would be better to tackle this before proceeding with
the current patch, I don’t think it's a prerequisite.
Also, since I’m not yet sure what the best way to fix it would be, I’ve
left it as-is for now.

> I don't think it's a good idea to put the logic to update
> ActiveQueryDesc into ExecutorRun. I think that function should really
> just call the hook, or standard_ExecutorRun. I don't know for sure
> whether this work should be moved down into ExecutorRun or up into the
> caller, but I don't think it should stay where it is.

Moved the logic from ExecutorRun() to standard_ExecutorRun(), since it
performs the necessary setup for collecting information during plan
execution i.e. calling InstrStartNode().

> My comment about the naming of WrapMultiExecProcNodesWithExplain() on
> the other thread also applies here: MultiExecProcNode() is an
> unrelated function.

Renamed WrapMultiExecProcNodesWithExplain to WrapPlanStatesWithExplain.

> Likewise, WrapExecProcNodeWithExplain() misses
> walking the node's subplan list.

Added logic for subplan to WrapExecProcNodeWithExplain() and
UnwrapExecProcNodeWithExplain().

> Also, I don't think an
> ereport(DEBUG1, ...) is appropriate here.

Removed these ereport(DEBUG1).

> Do we really need es->signaled? Couldn't we test es->analyze instead?

I think it's necessary.
Although when implementing progressive EXPLAIN, I believe we can use
es->analyze instead, this patch aims to retrieve plans for queries that
have neither an EXPLAIN nor an ANALYZE specified.

> Do we really need ExecProcNodeOriginal? Can we find some way to reuse
> ExecProcNodeReal instead of making the structure bigger?

I also wanted to implement this without adding elements to PlanState if
possible, but I haven't found a good solution, so the patch uses
ExecSetExecProcNode.

> I do think we should try to move some of this new code out into a
> separate source file, but I'm not yet sure what we should call it. We
> might want to share infrastructure with something what Rafael's patch,
> which he called progressive EXPLAIN but which is really closer to a
> general query progress facility, albeit perhaps too expensive to have
> enabled by default.

Made new files dynamic_explain.h and dynamic_explain.c, which you named and
it seems fine to me, and move most of the code to them.

> Does the documentation typically mention when a function is
> superuser-only? If so, it should do so here, too, using similar
> wording.

Added below explanation like other functions:

This function is restricted to superusers by default, but other
users can be granted EXECUTE to run the function.

> It seems unnecessary to remark that you can't log a query plan after a
> subtransaction abort, because the query wouldn't be running any more.

Removed the description.

> It's also true that you can't log a query after a toplevel abort, even
> if you're still waiting for the aborted transaction to roll back. But
> it also seems like once the aborted subtransaction is popped off the
> stack and we return to the parent transaction, we should be able to
> log any query running at the outer level.

It works as below:

session-1
begin;
savepoint s1;
err;
ERROR: syntax error at or near "err"
rollback to s1;
select pg_sleep(5);

session-2
select pg_log_query_plan(pid of session-1);

log
LOG: query plan running on backend with PID 40025 is:
Query Text: select pg_sleep(5);
Result (cost=0.00..0.01 rows=1 width=4)
Output: pg_sleep('5'::double precision)

> Does ActiveQueryDesc really need to be exposed to the whole system?
> Could it be a file-level variable?

On Thu, Apr 3, 2025 at 11:10 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> I think the question of whether ActiveQueryDesc can be file-level is
> separate from whether prior configuration is needed. If it is
> important to touch this from multiple source files, then it is fine
> for it to be global. However, if we have a new source file, say
> dynamic_explain.c, then you could have functions
> ProcessDynamicExplainInterrupt() and DynamicExplainCleanup() in that
> file to set, use, clear ActiveQueryDesc, and the rest of the system
> might not need to know about it.

Thanks for the advice, I took this approach and made ActiveQueryDesc
file-level variable.

On Thu, Apr 3, 2025 at 1:23 AM Sami Imseih <samimseih(at)gmail(dot)com> wrote:
> 7/
> Why do we only support TEXT format? I tried it with JSON
> and it looks fine in the log as well. I can imagine automated
> tools will want to be able to retrieve the plans using the
> structured formats.

I agree that allowing the choice of format would be useful.
However, I believe implementing this would require introducing new GUC or
creating a mechanism to pass the format from the executor of
pg_log_query_plan() to the target process.
For now, I think it would be better to minimize the scope of the initial
patch.

I think I have reflected your other comments in the attached patch.

Regards,
Atsushi Torikoshi

Attachment Content-Type Size
v43-0001-Add-function-to-log-the-plan-of-the-currently-ru.patch application/octet-stream 35.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2025-04-05 07:29:55 Re: Support NOT VALID / VALIDATE constraint options for named NOT NULL constraints
Previous Message Konstantin Knizhnik 2025-04-05 06:02:52 Re: New criteria for autovacuum