Re: [PROPOSAL] timestamp informations to pg_stat_statements

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: legrand legrand <legrand_legrand(at)hotmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [PROPOSAL] timestamp informations to pg_stat_statements
Date: 2018-03-10 20:12:11
Message-ID: 0fe51355-40c3-9d0d-0fc3-4404e0cd70e2@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 03/10/2018 04:43 PM, legrand legrand wrote:
> +1
> Having the time of first occurence of a statement is very usefull
> for trouble shouting, it permits for exemple to retrieve the order of
> operations in some complex cases (and thoses informations aren't
> taken by any third party collecting tool, that will only be able to
> provide a time range of occurence).
>

I really don't see how this would be useful in reconstructing order of
operations, particularly in complex cases where I'd expect the queries
to be executed repeatedly / interleaved in different ways. Furthermore,
it would only work for the very first execution of all statements, so
you would probably have to reset the stats over and over - which seems
to directly contradict the purpose of pg_stat_statements (aggregation of
data over longer periods of time).

So unfortunately this seems rather useless, and log_statements=all seems
like a much better / reliable approach to achieve that.

I also doubt it really allows computation of averages, e.g. queries per
second, because practical workloads usually have daily/weekly patterns
(and different queries may follow different patterns).

So I agree with Peter Geoghegan that tools regularly snapshotting
pg_stat_statements and processing that are a better approach. I've seen
a bunch of simple scripts doing just that, actually.

> I thougth that pgss rows where removed randomly when max rows was
> reached, wouldn't having last_executed information permit a kind of
> LRU removal ?
>

No, the entries are not removed randomly. We track "usage" for each
entry (essentially +1 for each time the query got executed, with a decay
factor applied on each eviction (and we evict 5% at a time).

It's not immediately obvious why something based on time of the
first/last execution would be better than the current algorithm.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-03-10 20:21:06 Re: Bogus use of canonicalize_qual
Previous Message Henry 2018-03-10 19:39:42 Re: [RFC] What would be difficult to make data models pluggable for making PostgreSQL a multi-model database?