Re: Sample rate added to pg_stat_statements

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc: Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sample rate added to pg_stat_statements
Date: 2025-01-07 11:05:54
Message-ID: 20994d10-ab44-4c6b-91bc-c4675cd2e58e@tantorlabs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 06.01.2025 18:57, Andrey M. Borodin wrote:
>> On 6 Jan 2025, at 15:50, Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com> wrote:
>>
>> Any suggestions for improvements?
> The patch looks good to me, just a few nits.
>
> 0. Perhaps, we could have a test for edge values of 0 and 1. I do not insist, just an idea to think about.

I would add tests, because they are never useless. I've added a simple
test which verifies hash table with queries after setting sample_rate =
0.0 and sample_rate = 1.0.

>
> 1. This code seems a little different from your patch. It is trying to avoid engaging PRNG. I'm not sure it's a good idea, but still. Also, it uses "<=", not "<".
>
> xact_is_sampled = log_xact_sample_rate != 0 &&
> (log_xact_sample_rate == 1 ||
> pg_prng_double(&pg_global_prng_state) <= log_xact_sample_rate);
>
> Thanks!
>
>
> Best regards, Andrey Borodin.
>

Are we sure we're discussing the same patch? Because these remarks refer
to the 5 version of the patch, which I abandoned due to your remarks.

On 06.01.2025 20:51, Sami Imseih wrote:
> Hi,
>
> I was looking at this patch, and I was specifically curious about
> how this works with prepared statements. The way the patch
> works now is that it determines if the statement is to be sampled
> at post_parse_analyze time which could lead to unexpected
> behavior.
>
> Let's take an example below in which the
> pg_stat_statements.sample_rate is set to 0 ( to mimic
> some sampling rate < 1 in which this query does not
> get sampled ). At that point, all subsequent executions
> of the statement will not get tracked at all. Is this
> what is expected for prepared statements? My concern
> is we will even lose more stats than what a user
> may expect.
>
> This of course will not be an issue for simple query.
>
> postgres=# set pg_stat_statements.sample_rate = 0;
> SET
> postgres=# select pg_stat_statements_reset();
> pg_stat_statements_reset
> -------------------------------
> 2025-01-06 11:45:23.484793-06
> (1 row)
>
> postgres=# SELECT $1 \parse stmt
>
> postgres=#
> postgres=# \bind_named stmt 1 \g
> ?column?
> ----------
> 1
> (1 row)
>
> postgres=# \bind_named stmt 1 \g
> ?column?
> ----------
> 1
> (1 row)
>
> postgres=# set pg_stat_statements.sample_rate = 1;
> SET
> postgres=# \bind_named stmt 1 \g
> ?column?
> ----------
> 1
> (1 row)
>
> postgres=# \bind_named stmt 1 \g
> ?column?
> ----------
> 1
> (1 row)
>
> postgres=# SELECT query, calls FROM pg_stat_statements;
> query | calls
> -------+-------
> (0 rows)
>
> Regards,
>
> Sami Imseih
> Amazon Web Services (AWS)
>
>

You are right. This is absolutely unexpected for users. Thank you for
the review.

To fix this, I suggest storing a random number in the [0, 1) range in a
separate variable, which will be used for comparisons in any place. We
will compare 'sample_rate' and random value not only in
pgss_post_parse_analyze(), but also in pgss_ProcessUtility() and
pgss_planner().

I attached patch with test and fixes.

If you have any objections or suggestions on how to improve it, I'm
always open to feedback.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

Attachment Content-Type Size
v7-0001-Allow-setting-sample-ratio-for-pg_stat_statements.patch text/x-patch 6.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2025-01-07 11:11:14 Re: Conflict detection for update_deleted in logical replication
Previous Message Dean Rasheed 2025-01-07 10:56:38 Re: Adding OLD/NEW support to RETURNING