Re: Sample rate added to pg_stat_statements

From: Sami Imseih <samimseih(at)gmail(dot)com>
To: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sample rate added to pg_stat_statements
Date: 2025-02-20 01:04:41
Message-ID: CAA5RZ0vOjQDC=OvUHuKMPYc8UT_peKt_yWpVaEqiP-JG2yqB9w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> But instead of blindly reducing the frequency via PRNG, we can take a more thoughtful approach with threshold by execute time:

> Find the most frequent query by column 'calls' in pg_stat_statements;
> In this query look at info about execution time: min_exec_time, max_exec_time, etc;
> Gradually increase the threshold from min_exec_time to max_exec_time, limiting the tracking of this query.
> Monitor performance: once the bottleneck is resolved, stop at the current threshold value.

This approach allows us to:

> Eliminate the spin-lock bottleneck;
> Preserve data about slow queries, which may be critical for performance analysis;
> Reduce the load on the most frequent queries causing contention, instead of uniformly reducing the frequency for all queries.

In my opinion, sample rate is a better fit for pg_stat_statements,
since the queries that
you care about the most are usually the most frequently executed. Sampling them
will still provide enough good data without the risk of not capturing
statistics about
them at all.

Longer running queries will also likely be the least frequent, so they
are already not likely
contributing to the spinlock contention. Also, the least frequent
queries will likely be aged
out faster, so pg_stat_statements was never really a good candidate to
track those anyways;
slow query logging with log_min_duration_statement is a better way to
ensure you capture
the data.

Maybe others may have a different opinion?

--

Sami

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-02-20 01:31:32 Re: Parallel heap vacuum
Previous Message Thomas Munro 2025-02-20 01:00:10 Re: GetRelationPath() vs critical sections