Re: Sample rate added to pg_stat_statements

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sample rate added to pg_stat_statements
Date: 2025-01-28 20:50:48
Message-ID: cc1beb52-3e4e-4866-8c32-a5967b98c977@tantorlabs.com
Lists: pgsql-hackers


On 28.01.2025 20:21, Sami Imseih wrote:
>> All the changes mentioned above are included in the v13 patch. Since the
>> patch status is 'Ready for Committer,' I believe it is now better for
>> upstream inclusion, with improved details in tests and documentation. Do
>> you have any further suggestions?
> I am not quite clear on the sample_1.out. I do like the idea of separating
> the sample tests, but I was thinking of something a bit more simple.
> What do you think of my attached, sampling.sql, test? It tests sample
> rate in both
> simple and extended query protocols and for both top level and
> nested levels?

That sounds great! I've added your sampling.sql file to my v14
patch. However, I was focused on testing sample_rate values between 0
and 1, and the approach I came up with used the sample{_1}.out files.
I've removed the test involving those files for now, but if the
committer prefers to keep it, I can reintroduce it.
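
As a rough illustration of why values strictly between 0 and 1 are
harder to test than the endpoints, here is a minimal Python sketch of
per-statement sampling. It is not the patch's code: should_track and
count_tracked are hypothetical names, and the rule "track when a uniform
random draw is below sample_rate" is an assumption about the general
technique.

```python
import random


def should_track(sample_rate: float, rng: random.Random) -> bool:
    """Decide whether one statement is sampled (hypothetical helper)."""
    # rng.random() is uniform on [0.0, 1.0), so a rate of 0 never tracks
    # and a rate of 1 always tracks.
    return rng.random() < sample_rate


def count_tracked(sample_rate: float, executions: int, seed: int = 0) -> int:
    """Count how many of `executions` statements would be tracked."""
    rng = random.Random(seed)
    return sum(should_track(sample_rate, rng) for _ in range(executions))


if __name__ == "__main__":
    for rate in (0.0, 0.5, 1.0):
        print(rate, count_tracked(rate, 10_000))
```

Only the endpoints give a deterministic count, so a regression test can
compare exact output for 0 and 1, while an intermediate value needs
either alternative expected files or some tolerance.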

>
>> If anyone has the capability to run this benchmark on machines with more
>> CPUs or with different queries, it would be nice. I’d appreciate any
>> suggestions or feedback.
> I wanted to share some additional benchmarks I ran as well
> on a r8g.48xlarge ( 192 vCPUs, 1,536 GiB of memory) configured
> with 16GB of shared_buffers. I also attached the benchmark.sh
> script used to generate the output.
> The benchmark is running the select-only pgbench workload,
> so we have a single heavily contentious entry, which is the
> worst case.
>
> The test shows that the spinlock (SpinDelay waits)
> becomes an issue at high connection counts and will
> become worse on larger machines. A sample_rate going from
> 1 to .75 shows a 60% improvement; but this is on a single
> contentious entry. Most workloads will likely not see this type
> of improvement. I also could not really observe
> this type of difference on smaller machines ( i.e. 32 vCPUs),
> as expected.
>
> ## init
> pgbench -i -s500
>
> ### 192 connections
> pgbench -c192 -j20 -S -Mprepared -T120 --progress 10
>
> sample_rate = 1
> tps = 484338.769799 (without initial connection time)
> waits
> -----
> 11107 SpinDelay
> 9568 CPU
> 929 ClientRead
> 13 DataFileRead
> 3 BufferMapping
>
> sample_rate = .75
> tps = 909547.562124 (without initial connection time)
> waits
> -----
> 12079 CPU
> 4781 SpinDelay
> 2100 ClientRead
>
> sample_rate = .5
> tps = 1028594.555273 (without initial connection time)
> waits
> -----
> 13253 CPU
> 3378 ClientRead
> 174 SpinDelay
>
> sample_rate = .25
> tps = 1019507.126313 (without initial connection time)
> waits
> -----
> 13397 CPU
> 3423 ClientRead
>
> sample_rate = 0
> tps = 1015425.288538 (without initial connection time)
> waits
> -----
> 13106 CPU
> 3502 ClientRead
>
> ### 32 connections
> pgbench -c32 -j20 -S -Mprepared -T120 --progress 10
>
> sample_rate = 1
> tps = 620667.049565 (without initial connection time)
> waits
> -----
> 1782 CPU
> 560 ClientRead
>
> sample_rate = .75
> tps = 620663.131347 (without initial connection time)
> waits
> -----
> 1736 CPU
> 554 ClientRead
>
> sample_rate = .5
> tps = 624094.688239 (without initial connection time)
> waits
> -----
> 1741 CPU
> 648 ClientRead
>
> sample_rate = .25
> tps = 628638.538204 (without initial connection time)
> waits
> -----
> 1702 CPU
> 576 ClientRead
>
> sample_rate = 0
> tps = 630483.464912 (without initial connection time)
> waits
> -----
> 1638 CPU
> 574 ClientRead
>
> Regards,
>
> Sami

Thank you so much for benchmarking this on a pretty large machine with a
large number of CPUs. The results look fantastic, and I truly appreciate
your effort.
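
To put numbers on that, the following Python sketch computes throughput
relative to sample_rate = 1 from the tps figures quoted above for the
192-connection run. The figures are copied from your output; the helper
name relative_tps is mine and only for illustration.

```python
# tps values copied from the 192-connection run quoted above,
# keyed by sample_rate.
TPS_192 = {
    1.0: 484338.769799,
    0.75: 909547.562124,
    0.5: 1028594.555273,
    0.25: 1019507.126313,
    0.0: 1015425.288538,
}


def relative_tps(results: dict[float, float], baseline: float = 1.0) -> dict[float, float]:
    """Return each tps divided by the tps at the baseline sample_rate."""
    base = results[baseline]
    return {rate: tps / base for rate, tps in results.items()}


if __name__ == "__main__":
    for rate, ratio in relative_tps(TPS_192).items():
        print(f"sample_rate={rate}: {ratio:.2f}x")
```

Running the same calculation on the 32-connection figures gives ratios
within about 2% of 1, which matches your observation that the smaller
configuration shows little difference.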

BTW, I realized that the 'sampling' test needs to be added not only to
the Makefile but also to meson.build. I've included that in the v14 patch.

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

Attachment: v14-0001-Allow-setting-sample-rate-for-pg_stat_statements.patch (text/x-patch, 13.8 KB)
