Quick Links

Re: Sample rate added to pg_stat_statements

From:	Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To:	Sami Imseih <samimseih(at)gmail(dot)com>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc:	Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Sample rate added to pg_stat_statements
Date:	2025-01-15 09:47:33
Message-ID:	cebfee9c-a894-46f1-b6bc-5bf4a0753b41@tantorlabs.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 15.01.2025 01:07, Sami Imseih wrote:
>> Alena, Sami – I apologize for not including you in the previous email.
>> If you're interested in this approach, I'm open to any suggestions.
>>
>> [0]:
>> https://www.postgresql.org/message-id/1b13d748-5e98-479c-9222-3253a734a038%40tantorlabs.com
> Here are my thoughts on this:
>
> There is good reason to apply sample rate selectively,
> but I am not sure if execution time is the way to go. I
> would rather apply a sample rate on the most frequently
> called queries, since I can gather enough samples
> to draw conclusions about performance of the query.
> I just don't know if that can be done in a sensible way,
> because while we can check the number of calls in the entry,
> we will need to do that with a shared lock and spin lock,
> which will defeat the purpose of this patch.

Agree. Indeed, we should reduce the load on the spin locks, but we can’t
check the most popular called queries without inspecting the hash table
and locking spin locks.

>
> This also got me thinking if maybe we should
> allow the user to disable sample rate for
> utility statements as those are less frequent
> in most workloads and a user may want to capture
> all such statements, i.e. DROP, CREATE, etc.
>
> Regards,
>
> Sami
>

Probably, but first I suggest benchmarking with sampling applied to all
queries. If the results are good, we can later filter certain queries
based on different characteristics.

On 06.01.2025 18:57, Andrey M. Borodin wrote:
> 1. This code seems a little different from your patch. It is trying to avoid engaging PRNG. I'm not sure it's a good idea, but still. Also, it uses "<=", not "<".
>
> xact_is_sampled = log_xact_sample_rate != 0 &&
> (log_xact_sample_rate == 1 ||
> pg_prng_double(&pg_global_prng_state) <= log_xact_sample_rate);

Sorry for the delayed reply. Andrey was right about this
suggestion—first, it makes the code more readable for others, and
second, it avoids engaging the PRNG on edge values of 0.0 and 1.0. I’ve
attached patch v11 with these changes.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

Attachment	Content-Type	Size
v11-0001-Allow-setting-sample-ratio-for-pg_stat_statements.patch	text/x-patch	8.7 KB

In response to

Re: Sample rate added to pg_stat_statements at 2025-01-14 22:07:36 from Sami Imseih

Responses

Re: Sample rate added to pg_stat_statements at 2025-01-15 17:16:01 from Sami Imseih

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Ajin Cherian	2025-01-15 10:24:38	Re: Adding a '--two-phase' option to 'pg_createsubscriber' utility.
Previous Message	Alexander Kukushkin	2025-01-15 09:35:42	Re: Infinite loop in XLogPageRead() on standby