From: | Rafael Thofehrn Castro <rafaelthca(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Proposal: Progressive explain |
Date: | 2025-03-26 00:52:03 |
Message-ID: | CAG0ozMpjEFThDGNqSoFTj1v_cpEs0rRQw5moVKU6Oy88oQ=Yhw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello Robert,
Fixed most of the recommendations. Going over one at a time.
> The documentation for the progressive_explain = { off | explain |
> analyze } option seems like it should go into more detail about how
> the "explain" and "analyze" values are different. I'm not 100% sure I
> know the answer, and I'm not the least-experienced person who will
> ever read this documentation.
Added details about behavior of each option. In the doc for that GUC
there is a link to another section that explains in detail how progressive
explains work.
> WrapMultiExecProcNodesWithExplain seems like a poor choice of name. It
> invites confusion with MultiExecProcNode, to which it is unrelated.
Renamed that function (and the unwrap equivalent) to
WrapMemberNodesExecProcNodesWithExplain
as MemberNodes is the name used by explain when parsing those types of
nodes.
> I do notice that WrapExecProcNodeWithExplain does not walk
> the ps->initPlan list, which I think is an oversight.
Fixed.
> I just went to some trouble to start breaking up the monolith that is
> explain.c, so I'm not that happy about seeing this patch dump another
> 800+ lines of source code into that file. Probably we should have a
> new source file for some or this, or maybe even more than one.
The whole progressive explain code was moved to a new set of files
explain_progressive.h and explain_progressive.c.
> The changes to explain.h add three new data types. Two of them begin
> with an uppercase letter and one with a lowercase letter. That seems
> inconsistent. I also don't think that the declaration of char plan[]
> is per PostgreSQL coding style. I believe we always write char
> plan[FLEXIBLE_ARRAY_MEMBER]. Also, maybe it should be named something
> other than plan, since it's really a string-ified explain-y kind of
> thing, not literally a Plan. Also, can we please not have structure
> members with single letter names? "h" and "p" are going to be
> completely ungreppable, and I like grepping
Done. Adjusted all data types to be uppercase. Added FLEXIBLE_ARRAY_MEMBER
and renamed "h" and "p".
> It looks very strange to me that ProgressiveExplainPrint() seems to
> have a theoretical possibility of generating the output and then
> throwing it away if we end up with entry == NULL. I guess maybe that
> case is not supposed to happen because ProgressiveExplainInit() is
> supposed to create the entry, but then why isn't this an elog(ERROR)
> or something instead of a no-op?
Changed to throw ERROR, which shouldn't happen.
> It seems like when we replace a longer entry with a shorter one, we
> forget that it was originally longer. Hence, if the length of a
> progressive EXPLAIN is alternately 2922 characters and 2923
> characters, we'll reallocate on every other progressive EXPLAIN
> instead of at most once.
Adjusted this logic. Structure ProgressiveExplainHashEntry now contains
a field to store the allocated size, which is used to compare with
the new plan being printed.
> Thanks for the pointer. I'm a bit skeptical about what's going on here
> in ProgressiveExplainReleaseFunc(). It seems like we're depending on
> shared memory to tell us whether we need to do purely backend-local
> cleanup, like calling disable_timeout() and resetting
> ProgressiveExplainPending and activeQueryDesc. I would have expected
> us to keep track in local memory of whether this kind of work needs to
> be done. It seems roundabout to take an LWLock, do a hash table lookup
> to see if there's an entry there, release the LWLock, and then very
> shortly thereafter take the lock a second time to release the entry
> that we now know is there.
> The comment in ProgressiveExplainCleanup about only detaching from the
> DSA if the query ended gracefully is not ideal from my point of view
> because it says what the code does instead of why the code does that
> thing. Also, the function is seemingly called with queryDesc as an
> argument not because you need it for anything but because you're going
> to test whether it's null. In that case, you could just pass a
> Boolean. Even then, something seems odd about this: why do we have to
> be holding ProgressiveExplainHashLock to dsa_detach the
> somewhat-inscrutably named area pe_a? And why are we not detaching it
> in case of error?
> I am wondering why you chose this relatively unusual error cleanup
> strategy. What I would have done is invent AtEOXact_ProgressiveExplain
> and AtSubEOXact_ProgressiveExplain. In some sense this looks simpler,
> because it doesn't need separate handling for transactions and
> subtransactions, but it's so unlike what we do in most places that
> it's hard for me to tell whether it's correct. I feel like the
> approach you've chosen here would make sense if what we wanted to do
> was basically release memory or some memory-like resource associated
> closely with the context -- e.g. expandedrecord.c releases a
> TupleDesc, but this is doing more than that.
> I think the effect of this choice is that cleanup of the
> progressive-EXPLAIN stuff happens much later than it normally would.
> Most of the time, in the case of an abort, we would want
> AbortTransaction() to release as many resources as possible, leaving
> basically no work to do at CleanupTransaction() time. This is so that
> if a user types "SELECT 1/0;" we release resources, as far as
> possible, right away, and don't wait for them to subsequently type
> "ROLLBACK;". The transaction lives on only as a shell. But these
> resources, if I'm reading this right, are going to stick around until
> the transaction is actually rolled back, because memory is not freed
> until CleanupTransaction() time. I wonder what happens if a query
> inside of an explicit transaction aborts after putting an entry in the
> progressive-explain view. My guess is that the entry will stick around
> until the actual rollback happens.
> In fact, now that I think about it, this is probably why we don't
> dsa_detach() in ProgressiveExplainCleanup() in cases of error -- the
> resource owner cleanup will have already released the DSA segments
> long before the memory context is deleted.
> I'm sort of inclined to say that this should be rewritten to do error
> cleanup in a more normal way. It's probably more code to do it that
> way, but I think having it be more similar to other subsystems is
> probably worth quite a bit.
Right. I use dsa_detach() only when the query finishes gracefully. This
is needed otherwise PG will complain with:
WARNING: resource was not closed: dynamic shared memory segment 32349774
That WARNING doesn't appear in case of error and according to my tests
shared memory is correctly being released.
I completely removed the MemoryContextCallback strategy and switched to
using the new function AtEOXact_ProgressiveExplain() called from
AbortTransaction().
This first version of the progressive explain feature was designed to only
keep
track of initial query called by the backend, ignoring all subquery calls.
So
I believe we don't need to worry about having to add custom logic in
AbortSubTransaction(). In case query errors out AbortTransaction() will be
called
and everything related to progressive explains will be cleaned.
> Another thing I just noticed is that pg_stat_progress_explain() uses
> beentry->st_procpid == ped->pid as the permissions check, but a more
> typical test is HAS_PGSTAT_PERMISSIONS(beentry->st_userid). I know
> that's only in pgstatfuncs.c, but I think it would be OK to duplicate
> the associated test here (i.e. has_privs_of_role(GetUserId(),
> ROLE_PG_READ_ALL_STATS) || has_privs_of_role(GetUserId(), role)). I
> don't really see a reason why this shouldn't use the same permission
> rules as other pg_stat_ things, in particular pg_stat_get_activity().
Adjusted the logic to use has_privs_of_role(GetUserId(),
> ROLE_PG_READ_ALL_STATS) || has_privs_of_role(GetUserId(), role) when
checking the privileges.
Rafael.
Attachment | Content-Type | Size |
---|---|---|
v11-0001-Proposal-for-progressive-explains.patch | application/octet-stream | 88.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Sami Imseih | 2025-03-26 00:56:29 | Re: query_id: jumble names of temp tables for better pg_stat_statement UX |
Previous Message | Noah Misch | 2025-03-26 00:43:28 | Re: AIO v2.5 |