From: | Greg Sabino Mullane <htamfids(at)gmail(dot)com> |
---|---|
To: | Rafael Thofehrn Castro <rafaelthca(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Proposal: Progressive explain |
Date: | 2024-12-30 16:37:14 |
Message-ID: | CAKAnmm+PrrEewDktzDFD+v-PRTmQbfjOnJksKvCTtrb9rkGOHA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sun, Dec 29, 2024 at 8:19 PM Rafael Thofehrn Castro <rafaelthca(at)gmail(dot)com>
wrote:
> Plans are only printed if the new GUC parameter progressive_explain is
> enabled.
>
Maybe track_explain instead? In the spirit of track_activity.
- progressive_explain_output_size: max output size of the plan printed in
> the in-memory hash table.
> - default: 4096
> - min: 100
>
4096 seems low, if this means the explain plan is truncated at that size.
Also, the 100 minimum seems arbitrary.
So we can enable verbose and settings - but not wal? I could see that one
being useful. Not so much the rest (timing, summary). And buffers has
recently changed, so no need to worry about that. :)
> - The plans are stored in a shared hash object (explainArray) allocated at
> database start, similar to procArray. ExplainHashShmemSize() computes
> shared memory needed for it, based on max_connections +
> max_parallel_workers for the amount of elements in the array and
> progressive_explain_output_size for the size per element.
>
Hmmm...don't have a solution/suggestion offhand, but using max_connections
would seem to be allocating a chunk of memory that is never used 99% of the
time, as most people don't run active queries near max_connections.
(Actually, on re-reading my draft, I would prefer a rotating pool like
pg_stat_statements does.)
- Column explain from pg_stat_progress_explain can only be visualized by
> superusers or the same role that is running the query. If none of those
> conditions are met, users will see "<insufficient privilege>".
>
Or pg_read_all_stats I presume? Are those other columns (e.g.
explain_count) being visible to anyone really useful, or can we throw them
all behind the same permission restriction?
- From (B) we see that using progressive explain slightly increases total
> execution time.
>
Is this using the default dirt-simple pgbench queries? What about queries
that generate very large explain plans?
- Do the columns in pg_stat_progress_explain make sense? Are we missing or
> adding unnecessary columns?
>
Perhaps having the interval and sample rate in here as well, since they are
user-level and thus could be different from other rows in the view. It is
tempting to throw in other things as well like the query_start and datname,
but we don't want to copy all of pg_stat_activity...
It's not clear if total_explain_time is now() - query_start or something
else. If not, I would love to see an elapsed time interval column.
Perhaps add a leader_pid column. That's something I would always be joining
with pg_stat_activity to find out.
- Do we want progressive explain to print plans of regular queries started
> without EXPLAIN if progressive_explain is enabled or should
> the feature be restricted to instrumented queries (EXPLAIN ANALYZE)?
>
The latter, but no strong opinion.
id="guc-progressive-explain"
The new docs should mention the view name here, IMO, in addition to the
existing link that has details.
Random idea: track_explain_min_duration
Looks very cool overall, +1.
Cheers,
Greg
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2024-12-30 16:39:24 | Re: RFC: Allow EXPLAIN to Output Page Fault Information |
Previous Message | Tom Lane | 2024-12-30 16:33:39 | Re: IANA timezone abbreviations versus timezone_abbreviations |