Quick Links

Re: Sample rate added to pg_stat_statements

From:	Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To:	Sami Imseih <samimseih(at)gmail(dot)com>
Cc:	"Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Sample rate added to pg_stat_statements
Date:	2025-01-20 14:20:10
Message-ID:	7da368b6-e44f-47bd-891e-236bfb51bd26@tantorlabs.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On 15.01.2025 20:16, Sami Imseih wrote:
>> Probably, but first I suggest benchmarking with sampling applied to all queries. If the results are good, we can later filter certain queries based on different characteristics.
> Absolutely. The benchmark numbers to justify this feature are
> the next step. Thanks for your work on this!
>
> Regards,
>
> Sami
>
Hi hackers,

I’d like to share the results of my benchmark.

To stress the spinlock in pg_stat_statements while incrementing counters
for popular entries, it's important to use the same query repeatedly. To
avoid overloading pgss with normalization, the queries should not
contain constants. I found that using the query 'SELECT now()' works
best for this purpose.

I ran the benchmark on a machine with 48 CPUs, which may not be
sufficient to fully test sampling, but I was able to achieve around 1.5
million TPS using 'SELECT now()'. To load the CPUs to about 85-90%, I
ran pgbench with -c 45 -j 45, using a custom 'SELECT now()' in
pgbench_script.sql file containing 'SELECT now()'. The benchmark was
conducted on a newly created 'pgbench' database, , with processes like
autovacuum, fsync, and checkpoints disabled(e.g., checkpoint _timeout =
'24h').

I tested various 'sample_rate' values and compared the performance with
pgss both enabled and disabled. The detailed results are attached in the
'sample_rate.txt' file, along with pgbench progress reports every 10
seconds.

pgbench -c 45 -j 45 -T 200 -f pgbench_script.sql --progress=10 pgbench

sample_rate | num of transactions | lat | stddev | tps

        1.0 |    1207127.572580 |0.037| 0.030 | 241400836
        0.8 |    1403551.516338 |0.032| 0.031 | 280673286
        0.5 |    1658596.614064 |0.027| 0.012 | 331679344
        0.2 |    1757747.509476 |0.025| 0.008 | 351507156
        0.0 |    1760055.986259 |0.025| 0.008 | 351965559
   pgss off |    1828743.791205 |0.024| 0.008 | 365703762

If anyone has the capability to run this benchmark on machines with more
CPUs or with different queries, it would be nice. I’d appreciate any
suggestions or feedback.

--.
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.

Attachment	Content-Type	Size
sample_rate.txt	text/plain	11.4 KB

In response to

Re: Sample rate added to pg_stat_statements at 2025-01-15 17:16:01 from Sami Imseih

Responses

Re: Sample rate added to pg_stat_statements at 2025-01-22 22:11:55 from Ilia Evdokimov

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Bertrand Drouvot	2025-01-20 15:01:38	Re: Log connection establishment timings
Previous Message	Kirill Reshke	2025-01-20 14:03:45	Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row