Sample rate added to pg_stat_statements

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Sample rate added to pg_stat_statements
Date: 2024-11-18 18:33:16
Message-ID: fe99e0ca-e564-480e-b865-5f0cee30bc60@tantorlabs.com
Lists: pgsql-hackers

Hi hackers,

Under high-load scenarios with a significant number of transactions per
second, pg_stat_statements introduces substantial overhead due to the
collection and storage of statistics. Currently, we are sometimes forced
to disable pg_stat_statements or adjust the size of the statistics using
pg_stat_statements.max, which is not always optimal. One potential
solution to this issue could be query sampling in pg_stat_statements.

A similar approach has been implemented in extensions like auto_explain
and pg_store_plans, and it has proven very useful in high-load systems.
However, this approach has its trade-offs, as it sacrifices statistical
accuracy for improved performance. This patch introduces a new
configuration parameter, pg_stat_statements.sample_rate, which controls
the sampling of query statistics in the pg_stat_statements extension.
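
As a rough illustration of the idea (not code from the attached patch), the
per-query sampling decision could look like the sketch below. It assumes
that sample_rate is a fraction between 0.0 and 1.0, as it is for
auto_explain.sample_rate; the function names should_track and simulate are
hypothetical and exist only for this example.

```python
import random


def should_track(sample_rate: float, rng: random.Random) -> bool:
    """Decide whether to record statistics for one query execution.

    Assumption: sample_rate is a fraction in [0.0, 1.0], mirroring
    auto_explain.sample_rate. The actual patch may define it differently.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")
    # rng.random() is uniform on [0.0, 1.0), so a rate of 0.0 never
    # tracks and a rate of 1.0 always tracks.
    return rng.random() < sample_rate


def simulate(executions: int, sample_rate: float, seed: int = 0) -> int:
    """Return how many of `executions` query runs would be recorded."""
    rng = random.Random(seed)
    return sum(should_track(sample_rate, rng) for _ in range(executions))
```

With a rate of 0.1, simulate(10000, 0.1) records roughly a tenth of the
executions, so any counter derived from the sampled data underestimates
the true value by about that factor. That is the accuracy trade-off
mentioned above.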

This patch serves as a proof of concept (POC), and I would like to hear
your thoughts on whether such an approach is viable and applicable.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

Attachment: 0001-PATCH-Allow-setting-sample-ratio-for-pg_stat_stateme.patch (text/x-patch, 3.8 KB)
