From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Jim Nasby <Jim(dot)Nasby(at)bluetreble(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Less than ideal error reporting in pg_stat_statements
Date: 2015-10-04 16:27:15
Message-ID: 32385.1443976035@sss.pgh.pa.us
Lists: pgsql-hackers
Peter Geoghegan <pg(at)heroku(dot)com> writes:
> Attached, revised patch deals with the issues around removing the
> query text file when garbage collection encounters trouble. There is
> no reason to be optimistic about any error within gc_qtexts() not
> recurring during a future garbage collection. OOM might be an
> exception, but probably not, since gc_qtexts() is reached when a new
> entry is created with a new query text, which in general makes OOM
> progressively more likely.
Hm. I'm unconvinced by the aspects of this that involve using
mean_query_len as a filter on which texts will be accepted. While that's
not necessarily bad in the abstract, there are way too many implementation
artifacts here that will result in random-seeming decisions about whether
to normalize. For instance:
* mean_query_len only gets recomputed in entry_dealloc(), which is only
run if we exceed pgss_max, and gc_qtexts(), which is only run if we decide
the query texts file is more than 50% bloat. So there could be quite a
long startup transient before the value gets off its initialization
minimum, and I'm suspicious that there might be plausible use-cases where
it never does. So it's not so much "restrict to a multiple of the mean
query len" as "restrict to some number that might once upon a time have
had some relation to the mean query len, or maybe not".
* One could expect that after changing mean_query_len, the population of
query texts would change character as a result of the filter behavior
changing, so that convergence to stable behavior over the long haul is
not exactly self-evident.
* As you've got it here, entry_dealloc() and gc_qtexts() don't compute
mean_query_len the same way, because only one of them discriminates
against sticky entries. So the value would bounce about rather randomly
based on which one had run last.
* I'm not exactly convinced that sticky entries should be ignored for
this purpose anyway.
Taking a step back, ISTM the real issue you're fighting here is lots of
orphaned sticky entries, but the patch doesn't do anything directly to fix
that problem. I wonder if we should do something like making
entry_dealloc() and/or gc_qtexts() aggressively remove sticky entries,
or at least those with "large" texts.
I think the aspects of this patch that are reasonably uncontroversial are
increasing the allowed malloc attempt size in gc_qtexts, flushing the
query text file on malloc failure, fixing the missing cleanup steps after
a gc failure, and making entry_dealloc's recomputation of mean_query_len
sane (which I'll define for the moment as "the same as gc_qtexts would
get"). Since we're hard against a release deadline, I propose to commit
just those changes, and we can consider the idea of a query size filter
and/or redefining mean_query_len at leisure.
regards, tom lane