From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sample rate added to pg_stat_statements
Date: 2025-02-10 08:05:29
Message-ID: 3546e2a7-5831-4c0a-996b-358d4ec483a6@tantorlabs.com
Lists: pgsql-hackers
Hi hackers,
Since the current patch is in the commitfest with the status 'Ready for
Committer', I'd like to summarize what it does, the problems it
addresses, and answer the key questions raised in the discussion thread.
Enabling pg_stat_statements can cause a performance drop for two main
reasons:
* Contention on the exclusive lock, which blocks the hash table when
allocating or deallocating an entry.
* Contention on the spinlock, which blocks access to an existing entry.
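To illustrate the second point, here is a minimal, self-contained sketch of the per-entry critical section. The struct and function names are illustrative stand-ins, not the module's actual code, and a portable pthread mutex stands in for PostgreSQL's slock_t:

```c
#include <pthread.h>

/* Simplified stand-in for a pg_stat_statements hash-table entry; the real
 * module protects each entry's counters with a slock_t spinlock (e->mutex). */
typedef struct Counters
{
    long    calls;
    double  total_time;
} Counters;

typedef struct pgssEntry
{
    pthread_mutex_t mutex;      /* stand-in for PostgreSQL's slock_t */
    Counters        counters;
} pgssEntry;

/* Every execution of an already-tracked query goes through this critical
 * section, so a very hot statement serializes all backends on one lock. */
static void
record_execution(pgssEntry *e, double elapsed_ms)
{
    pthread_mutex_lock(&e->mutex);
    e->counters.calls++;
    e->counters.total_time += elapsed_ms;
    pthread_mutex_unlock(&e->mutex);
}
```

The point is that the lock is taken on every execution of the same entry, which is why contention grows with query frequency and core count rather than with the number of distinct statements.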
The exclusive lock issue can be mitigated by normalizing similar queries
to the same queryid; there is an ongoing thread discussing this approach
for IN and ARRAY queries [0]. Another solution is integrating
pg_stat_statements into the central pgstats machinery via custom APIs,
without pushing the module into core [1].
However, even if we do all that, the spinlock contention problem will
persist for very frequent queries. This is exactly what I aim to solve
with sampling. Other approaches, such as replacing the spinlock with
atomic variables or an lwlock, have not yielded good results [2]. Even
track_planning was disabled by default to reduce contention, but on
high-core systems, the issue still remains. So far, I see no alternative
solution to this problem.
The benchmark results for different numbers of CPUs, demonstrating the
impact of this patch:
pgbench -c{CPUs} -j20 -S -Mprepared -T120
CPUs | sample_rate |     tps | CPU waits | ClientRead wait | SpinDelay wait
 192 |        1.0  |  484338 |      9568 |             929 |          11107
 192 |        0.75 |  909547 |     12079 |            2100 |           4781
 192 |        0.5  | 1028594 |     13253 |            3378 |            174
 192 |        0.25 | 1019507 |     13397 |            3423 |              -
 192 |        0.0  | 1015425 |     13106 |            3502 |              -
  48 |        1.0  |  643251 |       911 |             932 |             44
  48 |        0.75 |  653946 |       939 |             939 |              3
  48 |        0.5  |  651654 |       841 |             932 |              -
  48 |        0.25 |  652668 |       860 |             910 |              -
  48 |        0.0  |  659111 |       849 |             882 |              -
  32 |        1.0  |  620667 |      1782 |             560 |              -
  32 |        0.75 |  620663 |      1736 |             554 |              -
  32 |        0.5  |  624094 |      1741 |             648 |              -
  32 |        0.25 |  628638 |      1702 |             576 |              -
  32 |        0.0  |  630483 |      1638 |             574 |              -
Some suggested sampling based on execution time instead of a random
generator. While this is a good idea, it has a critical limitation.
Execution time can only be checked in pgss_ExecutorEnd. However,
pgss_planner already interacts with the spinlock, and we cannot sample
based on execution time at that stage. Finding a criterion that works
for both pgss_planner and pgss_ExecutorEnd is difficult. This is why
random sampling remains the most viable option.
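A minimal sketch of the idea, with illustrative names rather than the patch's exact code: the decision is drawn once per query and the cached result is then consulted by every hook (pgss_planner, pgss_ExecutorEnd), so all of them agree. The real module would use PostgreSQL's PRNG; rand() stands in here for portability:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative stand-in for a pg_stat_statements.sample_rate-style GUC. */
static double pgss_sample_rate = 0.75;

/* Cached per-query decision, drawn once and reused by every hook. */
static bool current_query_sampled = true;

/* Draw the sampling decision for a new query.  Rates of 1.0 and 0.0 are
 * short-circuited so the default behavior has no PRNG cost. */
static void
decide_sampling(void)
{
    if (pgss_sample_rate >= 1.0)
        current_query_sampled = true;
    else if (pgss_sample_rate <= 0.0)
        current_query_sampled = false;
    else
        current_query_sampled =
            ((double) rand() / (double) RAND_MAX) < pgss_sample_rate;
}
```

Because the decision is made before planning, it is usable at every stage, which an execution-time criterion cannot be.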
By the way, instead of using a bool flag to check whether we are inside
pgss_post_parse_analyze, we now pass a pgssStoreKind to pgss_enabled.
This makes the logic clearer and more readable, explicitly indicating
where we need to check whether a query should be sampled. This change is
included in the new v15 patch.
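As a rough, hypothetical simplification of that change (the enum values follow the existing pgssStoreKind in pg_stat_statements.c; the function body and the current_query_sampled variable are illustrative assumptions, not the patch verbatim):

```c
#include <stdbool.h>

/* Store kinds as they appear in pg_stat_statements.c. */
typedef enum pgssStoreKind
{
    PGSS_INVALID = -1,
    PGSS_PLAN = 0,
    PGSS_EXEC,
    PGSS_NUMKIND
} pgssStoreKind;

/* Cached per-query sampling decision (illustrative). */
static bool current_query_sampled = true;

/* Passing the kind makes explicit which hook is asking, instead of
 * consulting a separate module-level "inside post_parse_analyze" flag. */
static bool
pgss_store_enabled(pgssStoreKind kind)
{
    if (kind == PGSS_INVALID)
        return true;                /* pre-decision stages bypass sampling */
    return current_query_sampled;   /* plan/exec stages honor the decision */
}
```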
Are there any remaining concerns or alternative suggestions regarding
this approach? I'm happy to discuss further.
[0]: https://commitfest.postgresql.org/51/2837/
[1]: https://www.postgresql.org/message-id/Zz0MFPq1s4WFxxpL%40paquier.xyz
[2]: https://commitfest.postgresql.org/29/2634/
--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
Attachment: v15-0001-Allow-setting-sample-rate-for-pg_stat_statements.patch (text/x-patch, 13.7 KB)