From: Rahila Syed <rahilasyed90(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Enhancing Memory Context Statistics Reporting
Date: 2025-01-21 11:27:14
Message-ID: CAH2L28vR622wV44XenbhWc7ETpNtjS_oeTha7OxMx35LjWPPqQ@mail.gmail.com
Lists: pgsql-hackers
Hi Tomas,
> I've tried the pgbench test
> again, to see if it gets stuck somewhere, and I'm observing this on a
> new / idle cluster:
>
> $ pgbench -n -f test.sql -P 1 test -T 60
> pgbench (18devel)
> progress: 1.0 s, 1647.9 tps, lat 0.604 ms stddev 0.438, 0 failed
> progress: 2.0 s, 1374.3 tps, lat 0.727 ms stddev 0.386, 0 failed
> progress: 3.0 s, 1514.4 tps, lat 0.661 ms stddev 0.330, 0 failed
> progress: 4.0 s, 1563.4 tps, lat 0.639 ms stddev 0.212, 0 failed
> progress: 5.0 s, 1665.0 tps, lat 0.600 ms stddev 0.177, 0 failed
> progress: 6.0 s, 1538.0 tps, lat 0.650 ms stddev 0.192, 0 failed
> progress: 7.0 s, 1491.4 tps, lat 0.670 ms stddev 0.261, 0 failed
> progress: 8.0 s, 1539.5 tps, lat 0.649 ms stddev 0.443, 0 failed
> progress: 9.0 s, 1517.0 tps, lat 0.659 ms stddev 0.167, 0 failed
> progress: 10.0 s, 1594.0 tps, lat 0.627 ms stddev 0.227, 0 failed
> progress: 11.0 s, 28.0 tps, lat 0.705 ms stddev 0.277, 0 failed
> progress: 12.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
> progress: 13.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
> progress: 14.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
> progress: 15.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
> progress: 16.0 s, 1480.6 tps, lat 4.043 ms stddev 130.113, 0 failed
> progress: 17.0 s, 1524.9 tps, lat 0.655 ms stddev 0.286, 0 failed
> progress: 18.0 s, 1246.0 tps, lat 0.802 ms stddev 0.330, 0 failed
> progress: 19.0 s, 1383.1 tps, lat 0.722 ms stddev 0.934, 0 failed
> progress: 20.0 s, 1432.7 tps, lat 0.698 ms stddev 0.199, 0 failed
> ...
>
> There's always a period of 10-15 seconds when everything seems to be
> working fine, and then a couple seconds when it gets stuck, with the usual
>
> LOG: Wait for 69454 process to publish stats timed out, trying again
>
> The PIDs I've seen were for checkpointer, autovacuum launcher, ... all
> of that are processes that should be handling the signal, so how come it
> gets stuck every now and then? The system is entirely idle, there's no
> contention for the shmem stuff, etc. Could it be forgetting about the
> signal in some cases, or something like that?
>
Yes. This occurs when a backend receives concurrent signals and processes both together, publishing the statistics only once. Once the first client that gains access reads the stats, they are erased, causing the second client to wait until it times out.
If we make clients wait for the latest stats, timeouts may occur during concurrent operations. To avoid such timeouts, we can retain the previously published memory statistics for every backend and skip waiting for the latest statistics when the previous statistics are newer than STALE_STATS_LIMIT. This limit can be determined based on the server load and how quickly the server handles memory statistics requests.
For example, on a server running make -j 4 installcheck-world while concurrently probing client backends for memory statistics using pgbench, accepting statistics that were approximately 1 second old helped eliminate timeouts. Conversely, on an idle system, accepting previous statistics up to 0.1 seconds old was sufficient to avoid any timeouts caused by concurrent requests.
PFA an updated and rebased patch that includes the capability to associate timestamps with statistics. Additionally, I have made some minor fixes and improved the indentation.
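To illustrate how a timestamp could be associated with the published statistics, here is a minimal sketch of the publishing side. The MemoryStatsEntry layout and the field names are assumptions for illustration, not the exact patch code; GetCurrentTimestamp() and ConditionVariableBroadcast() are the existing PostgreSQL helpers.

#include "postgres.h"
#include "storage/condition_variable.h"
#include "utils/timestamp.h"

/* Assumed per-backend shared-memory entry layout (illustrative only) */
typedef struct MemoryStatsEntry
{
	ConditionVariable	memstats_cv;		/* wakes clients waiting on this backend */
	TimestampTz			stats_timestamp;	/* when the stats below were published */
	/* ... per-context statistics ... */
} MemoryStatsEntry;

/*
 * Run by the target backend after it has copied its memory context
 * statistics into its shared-memory entry.
 */
static void
publish_memory_stats(MemoryStatsEntry *entry)
{
	/* Stamp the snapshot so readers can judge how old it is */
	entry->stats_timestamp = GetCurrentTimestamp();

	/* Wake any client backends waiting for this backend's statistics */
	ConditionVariableBroadcast(&entry->memstats_cv);
}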
Currently, I have set STALE_STATS_LIMIT to 0.5 seconds in the code, which means we do not wait for newer statistics if the previous statistics were published within 0.5 seconds of the current request.
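For reference, the reader-side staleness check could look roughly like this, reusing the assumed MemoryStatsEntry layout from the sketch above; TimestampDifferenceExceeds() is the existing timestamp helper, and 500 ms mirrors the 0.5-second STALE_STATS_LIMIT (the function name is illustrative).

/* 0.5 s, matching the value currently set in the code (illustrative name) */
#define STALE_STATS_LIMIT_MS	500

/*
 * Decide whether the previously published statistics are recent enough to
 * reuse without signalling the target backend and waiting again.
 */
static bool
previous_stats_are_fresh(MemoryStatsEntry *entry)
{
	TimestampTz now = GetCurrentTimestamp();

	/* True if the published snapshot is older than the limit */
	if (TimestampDifferenceExceeds(entry->stats_timestamp, now,
								   STALE_STATS_LIMIT_MS))
		return false;			/* too old: request and wait for fresh stats */

	return true;				/* recent enough: reuse without waiting */
}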
In short, there are the following options for designing the wait for statistics, depending on whether we expect concurrent requests to a backend for memory statistics to be common.

1. Always get the latest statistics and time out if unable to. This works fine for sequential probing, which is going to be the most common use case, but it can lead to a backend timing out after up to MAX_TRIES * MEMSTATS_WAIT_TIMEOUT.

2. Determine an appropriate STALE_STATS_LIMIT and do not wait for the latest stats if the previous statistics are within that limit. This helps avoid timeouts in the case of concurrent requests.
3. Do what the v10 patch on this thread does: wait for the latest statistics for up to MEMSTATS_WAIT_TIMEOUT; otherwise, display the previous statistics, regardless of when they were published. Since timeouts are likely to occur only during concurrent requests, the displayed statistics are unlikely to be very outdated. However, in this scenario, we observe the behavior you mentioned, i.e., concurrent backends can get stuck for the duration of MEMSTATS_WAIT_TIMEOUT (currently 5 seconds). A rough sketch of this wait loop appears after the list.
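As a rough sketch of the third approach's client-side wait (assumed names and constants again; ConditionVariablePrepareToSleep(), ConditionVariableTimedSleep() and ConditionVariableCancelSleep() are the existing primitives, while WAIT_EVENT_MEM_CXT_PUBLISH and the retry count are assumptions rather than the exact v10 code):

#define MEMSTATS_WAIT_TIMEOUT	5000	/* ms per attempt, i.e. 5 s as noted above */
#define MAX_TRIES				3		/* illustrative retry count */

/*
 * Wait for the target backend to publish fresh statistics.  If every
 * attempt times out, the caller falls back to displaying the previously
 * published statistics, regardless of their age.
 */
static void
wait_for_fresh_stats(MemoryStatsEntry *entry, int target_pid)
{
	ConditionVariablePrepareToSleep(&entry->memstats_cv);

	for (int i = 0; i < MAX_TRIES; i++)
	{
		/* Returns false if the condition variable was signalled in time */
		if (!ConditionVariableTimedSleep(&entry->memstats_cv,
										 MEMSTATS_WAIT_TIMEOUT,
										 WAIT_EVENT_MEM_CXT_PUBLISH))
			break;				/* woken up: fresh stats should be available */

		elog(LOG, "Wait for %d process to publish stats timed out, trying again",
			 target_pid);
	}

	ConditionVariableCancelSleep();

	/*
	 * A real implementation would re-check the published timestamp here,
	 * since condition-variable wakeups can be spurious.
	 */
}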
I am inclined toward the third approach, as concurrent requests are not expected to be a common use case for this feature. Moreover, with the second approach, determining an appropriate value for STALE_STATS_LIMIT is challenging, as it depends on the server's load.

Kindly let me know your preference. I have attached a patch that implements the second approach for testing; the third approach is implemented in the v10 patch.
Thank you,
Rahila Syed
Attachment: v11-0001-Function-to-report-memory-context-stats-of-any-backe.patch (application/octet-stream, 46.6 KB)