Re: Sample rate added to pg_stat_statements

From: Ilia Evdokimov <ilya(dot)evdokimov(at)tantorlabs(dot)com>
To: Sami Imseih <samimseih(at)gmail(dot)com>
Cc: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Sample rate added to pg_stat_statements
Date: 2025-01-30 10:58:47
Message-ID: 4f98ec33-c627-4eb4-84f3-9994a0982277@tantorlabs.com
Lists: pgsql-hackers


On 29.01.2025 21:52, Ilia Evdokimov wrote:
>
>>> ... I also attached the benchmark.sh
>>> script used to generate the output.
>
>
> In my opinion, if we can't observe a spinlock bottleneck on 32 CPUs,
> we should determine the CPU count at which it becomes one. This will
> help us understand the scale of the problem. Does this make sense, or
> are there really no real workloads where the same query runs on more
> than 32 CPUs, and we've been trying to solve a non-existent problem?

For objectivity, I ran the same benchmark on 48 CPUs with -c 48 -j 20.

### 48 connections
pgbench -c48 -j20 -S -Mprepared -T120 --progress 10

sample_rate = 1
tps = 643251.640175 (without initial connection time)
waits
-----
    932  ClientRead
    911  CPU
     44  SpinDelay

sample_rate = .75
tps = 653946.777122 (without initial connection time)
waits
-----
    939  CPU
    875  ClientRead
      3  SpinDelay

sample_rate = .5
tps = 651654.348463 (without initial connection time)
waits
-----
    932  ClientRead
    841  CPU

sample_rate = .25
tps = 652668.807245 (without initial connection time)
waits
-----
    910  ClientRead
    860  CPU

sample_rate = 0
tps = 659111.347019 (without initial connection time)
waits
-----
    882  ClientRead
    849  CPU

There is a small amount of SpinDelay at sample_rate = 1 and .75, as the
user mentioned. With this, we can narrow down the threshold where the
problem appears: in these runs it lies between 32 and 48 CPUs.

To summarize the results of all benchmarks, I compiled them into a table:

 CPUs | sample_rate |     tps | CPU waits | ClientRead waits | SpinDelay waits
------+-------------+---------+-----------+------------------+-----------------
  192 |         1.0 |  484338 |      9568 |              929 |           11107
  192 |        0.75 |  909547 |     12079 |             2100 |            4781
  192 |         0.5 | 1028594 |     13253 |             3378 |             174
  192 |        0.25 | 1019507 |     13397 |             3423 |               -
  192 |         0.0 | 1015425 |     13106 |             3502 |               -

   48 |         1.0 |  643251 |       911 |              932 |              44
   48 |        0.75 |  653946 |       939 |              875 |               3
   48 |         0.5 |  651654 |       841 |              932 |               -
   48 |        0.25 |  652668 |       860 |              910 |               -
   48 |         0.0 |  659111 |       849 |              882 |               -

   32 |         1.0 |  620667 |      1782 |              560 |               -
   32 |        0.75 |  620667 |      1736 |              554 |               -
   32 |         0.5 |  624094 |      1741 |              648 |               -
   32 |        0.25 |  628638 |      1702 |              576 |               -
   32 |         0.0 |  630483 |      1638 |              574 |               -
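
To make the scale easier to compare across machines, here is a small
sketch that turns the tps column into a percent change relative to
sample_rate = 1.0 on the same CPU count. The tps values are copied as-is
from the table; the dictionary layout and the function name
gain_vs_full_sampling are only illustrative and are not part of
benchmark.sh.

```python
# tps figures copied as-is from the summary table (whole numbers only).
# Keys are (CPUs, sample_rate). The layout is illustrative, not from benchmark.sh.
TPS = {
    (192, 1.0): 484338, (192, 0.75): 909547, (192, 0.5): 1028594,
    (192, 0.25): 1019507, (192, 0.0): 1015425,
    (48, 1.0): 643251, (48, 0.75): 653946, (48, 0.5): 651654,
    (48, 0.25): 652668, (48, 0.0): 659111,
    (32, 1.0): 620667, (32, 0.75): 620667, (32, 0.5): 624094,
    (32, 0.25): 628638, (32, 0.0): 630483,
}


def gain_vs_full_sampling(cpus, rate):
    """Percent tps change relative to sample_rate = 1.0 on the same CPU count."""
    base = TPS[(cpus, 1.0)]
    return 100.0 * (TPS[(cpus, rate)] - base) / base


if __name__ == "__main__":
    for cpus in (192, 48, 32):
        for rate in (0.75, 0.5, 0.25, 0.0):
            gain = gain_vs_full_sampling(cpus, rate)
            print(f"{cpus:>4} CPUs, sample_rate={rate:<4}: {gain:+6.1f}%")
```

With these numbers, lowering sample_rate on 192 CPUs gives roughly 88%
to 112% more tps than sample_rate = 1.0, while on 48 and 32 CPUs the
change stays under 3%.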

--
Best regards,
Ilia Evdokimov,
Tantor Labs LLC.
