From: | Daniel Gustafsson <daniel(at)yesql(dot)se> |
---|---|
To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | Julien Rouhaud <rjuju123(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: proposal - log_full_scan |
Date: | 2021-07-06 14:07:33 |
Message-ID: | 3E8BD81C-07C9-46F2-98D3-2E4EEEFB82FD@yesql.se |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Looking at this I like the idea in principle, but I'm not convinced that
auto_explain is the right tool for this. auto_explain is for identifying slow
queries, and what you are proposing is to identify queries with a certain
"shape" (for lack of a better term) even if they aren't slow as per the
log_min_duration setting. If log_min_duration is deemed to crude due to query
volume then sample_rate is the tool. If sample_rate is also discarded, then
pg_stat_statements seems a better option.
Also, why just sequential scans (apart from it being this specific usecase)?
If the idea is to track aspects of execution which are deemed slow, then
tracking for example spills etc would be just as valid. Do you have thoughts
on that?
That being said, a few comments on the patch:
- (auto_explain_log_min_duration >= 0 && \
+ ((auto_explain_log_min_duration >= 0 || auto_explain_log_seqscan != -1) && \
Is there a reason to not follow the existing code and check for >= 0?
+ DefineCustomIntVariable("auto_explain.log_seqscan",
It's only a matter of time before another node is proposed for logging, and
then we'll be stuck adding log_XXXnode GUCs. Is there a more future-proof way
to do this?
+ "Sets the minimum tuples produced by sequantial scans which plans will be logged",
s/sequantial/sequential/
- es->analyze = (queryDesc->instrument_options && auto_explain_log_analyze);
+ es->analyze = (queryDesc->instrument_options && (auto_explain_log_analyze || auto_explain_log_seqscan != -1));
Turning on ANALYZE when log_analyze isn't set to True is a no-no IMO.
+ * Colllect relations where log_seqscan limit was exceeded
s/Colllect/Collect/
+ if (*relnames.data != '\0')
+ appendStringInfoString(&relnames, ",");
This should use appendStringInfoChar instead.
+ (errmsg("duration: %.3f ms, over limit seqscans: %s, plan:\n%s",
The "over limit" part is superfluous since it otherwise wouldn't be logged. If
we're prefixing something wouldn't it be more helpful to include the limit,
like: "seqscans >= %d tuples returned:". I'm not a fan of "seqscans" but
spelling it out is also quite verbose and this is grep-able.
Documentation and tests are also missing
--
Daniel Gustafsson https://vmware.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Dipesh Pandit | 2021-07-06 14:38:42 | Re: .ready and .done files considered harmful |
Previous Message | Dilip Kumar | 2021-07-06 13:39:38 | Re: [PATCH] Use optimized single-datum tuplesort in ExecSort |