From: | Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> |
---|---|
To: | Rahila Syed <rahilasyed90(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | torikoshia <torikoshia(at)oss(dot)nttdata(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Enhancing Memory Context Statistics Reporting |
Date: | 2025-01-21 16:31:53 |
Message-ID: | 70146bd0-da0f-48ee-bece-1e35536189e0@oss.nttdata.com |
Lists: | pgsql-hackers |
On 2025/01/21 20:27, Rahila Syed wrote:
> Hi Tomas,
>
>> I've tried the pgbench test
>> again, to see if it gets stuck somewhere, and I'm observing this on a
>> new / idle cluster:
>>
>> $ pgbench -n -f test.sql -P 1 test -T 60
>> pgbench (18devel)
>> progress: 1.0 s, 1647.9 tps, lat 0.604 ms stddev 0.438, 0 failed
>> progress: 2.0 s, 1374.3 tps, lat 0.727 ms stddev 0.386, 0 failed
>> progress: 3.0 s, 1514.4 tps, lat 0.661 ms stddev 0.330, 0 failed
>> progress: 4.0 s, 1563.4 tps, lat 0.639 ms stddev 0.212, 0 failed
>> progress: 5.0 s, 1665.0 tps, lat 0.600 ms stddev 0.177, 0 failed
>> progress: 6.0 s, 1538.0 tps, lat 0.650 ms stddev 0.192, 0 failed
>> progress: 7.0 s, 1491.4 tps, lat 0.670 ms stddev 0.261, 0 failed
>> progress: 8.0 s, 1539.5 tps, lat 0.649 ms stddev 0.443, 0 failed
>> progress: 9.0 s, 1517.0 tps, lat 0.659 ms stddev 0.167, 0 failed
>> progress: 10.0 s, 1594.0 tps, lat 0.627 ms stddev 0.227, 0 failed
>> progress: 11.0 s, 28.0 tps, lat 0.705 ms stddev 0.277, 0 failed
>> progress: 12.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
>> progress: 13.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
>> progress: 14.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
>> progress: 15.0 s, 0.0 tps, lat 0.000 ms stddev 0.000, 0 failed
>> progress: 16.0 s, 1480.6 tps, lat 4.043 ms stddev 130.113, 0 failed
>> progress: 17.0 s, 1524.9 tps, lat 0.655 ms stddev 0.286, 0 failed
>> progress: 18.0 s, 1246.0 tps, lat 0.802 ms stddev 0.330, 0 failed
>> progress: 19.0 s, 1383.1 tps, lat 0.722 ms stddev 0.934, 0 failed
>> progress: 20.0 s, 1432.7 tps, lat 0.698 ms stddev 0.199, 0 failed
>> ...
>>
>> There's always a period of 10-15 seconds when everything seems to be
>> working fine, and then a couple of seconds when it gets stuck, with the usual
>>
>> LOG: Wait for 69454 process to publish stats timed out, trying again
>>
>> The PIDs I've seen were for checkpointer, autovacuum launcher, ... all
>> of those are processes that should be handling the signal, so how come it
>> gets stuck every now and then? The system is entirely idle, there's no
>> contention for the shmem stuff, etc. Could it be forgetting about the
>> signal in some cases, or something like that?
>
> Yes, this occurs when a backend receives two signals concurrently: both
> are processed together, and the stats are published only once. Once the
> stats are read by the first client that gains access, they are erased,
> causing the second client to wait until it times out.
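>
> Roughly, the coalescing happens because the interrupt is just a flag; a
> minimal sketch of the two sides (the function and variable names here
> are illustrative, not necessarily the patch's):
>
>     /* Signal handler side: a second signal arriving before the flag
>      * is consumed is simply absorbed. */
>     static volatile sig_atomic_t memstats_requested = false;
>
>     void
>     HandleGetMemoryContextInterrupt(void)
>     {
>         memstats_requested = true;  /* setting true twice = one request */
>         SetLatch(MyLatch);
>     }
>
>     /* CHECK_FOR_INTERRUPTS() side: publishes once, however many
>      * signals arrived in the meantime. */
>     if (memstats_requested)
>     {
>         memstats_requested = false;
>         PublishMemoryContextStats();  /* hypothetical: copy stats to shmem */
>     }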
>
> If we make clients wait for the latest stats, timeouts may occur during concurrent
> operations. To avoid such timeouts, we can retain the previously published memory
> statistics for every backend and avoid waiting for the latest statistics when the
> previous statistics are newer than STALE_STATS_LIMIT. This limit can be determined
> based on the server load and how fast the memory statistics requests are being
> handled by the server.
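>
> Concretely, the requesting side could check staleness like this (a
> sketch only; the struct and field names are placeholders, not
> necessarily what the patch uses, though TimestampDifferenceExceeds()
> and GetCurrentTimestamp() are standard PostgreSQL routines):
>
>     /* Skip waiting if the published stats are fresh enough. */
>     #define STALE_STATS_LIMIT_MS 500        /* 0.5 s */
>
>     TimestampTz published = memCtxState->stats_timestamp;
>
>     if (!TimestampDifferenceExceeds(published, GetCurrentTimestamp(),
>                                     STALE_STATS_LIMIT_MS))
>         return;     /* reuse the previously published stats, no wait */
>
>     /* Otherwise, signal the target process and wait for fresh stats. */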
>
> For example, on a server running make -j 4 installcheck-world while concurrently
> probing client backends for memory statistics using pgbench, accepting statistics
> that were approximately 1 second old helped eliminate timeouts. Conversely, on an
> idle system, waiting for new statistics when the previous ones were older than 0.1
> seconds was sufficient to avoid any timeouts caused by concurrent requests.
>
> PFA an updated and rebased patch that includes the capability to associate
> timestamps with statistics. Additionally, I have made some minor fixes and improved
> the indentation.
>
> Currently, I have set STALE_STATS_LIMIT to 0.5 seconds in the code, which
> means we do not wait for newer statistics if the previous statistics were
> published within 0.5 seconds of the current request.
>
> In short, there are the following options for designing the wait for
> statistics, depending on whether we expect concurrent requests to a backend
> for memory statistics to be common.
>
> 1. Always get the latest statistics, and time out if we cannot.
>
> This works fine for sequential probing, which is going to be the most
> common use case.
> However, it can lead to backend timeouts of up to
> MAX_TRIES * MEMSTATS_WAIT_TIMEOUT.
>
> 2. Determine an appropriate STALE_STATS_LIMIT and do not wait for the
> latest stats if the previous statistics are within that limit.
> This helps avoid timeouts in the case of concurrent requests.
>
> 3. Do what the v10 patch on this thread does (see the sketch below):
>
> Wait for the latest statistics for up to MEMSTATS_WAIT_TIMEOUT;
> otherwise, display the previous statistics, regardless of when they were
> published.
>
> Since timeouts are likely to occur only during concurrent requests, the
> displayed statistics are unlikely to be very outdated.
> However, in this scenario, we observe the behavior you mentioned, i.e.,
> concurrent backends can get stuck for the duration of
> MEMSTATS_WAIT_TIMEOUT (currently 5 seconds).
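>
> A sketch of that wait, using PostgreSQL's condition variables (the
> struct, timestamp field, and wait event name are illustrative):
>
>     ConditionVariablePrepareToSleep(&memCtxState->cv);
>     while (memCtxState->stats_timestamp <= request_time)
>     {
>         /* ConditionVariableTimedSleep() returns true on timeout */
>         if (ConditionVariableTimedSleep(&memCtxState->cv,
>                                         MEMSTATS_WAIT_TIMEOUT,
>                                         WAIT_EVENT_MEMSTATS_PUBLISH))
>             break;          /* timed out: fall back to previous stats */
>     }
>     ConditionVariableCancelSleep();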
>
> I am inclined toward the third approach, as concurrent requests are not expected
> to be a common use case for this feature. Moreover, with the second approach,
> determining an appropriate value for STALE_STATS_LIMIT is challenging, as it
> depends on the server's load.
Just an idea: as another option, how about blocking new requests to
the target process (e.g., causing them to fail with an error or
return NULL with a warning) if a previous request is still pending?
Users can simply retry the request if it fails. IMO failing quickly
seems preferable to getting stuck for a while in cases with concurrent
requests.
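
For example (a rough sketch; the shared-memory slot and field names are
made up, not from the patch), the requesting backend could do:

    /* Allow only one in-flight request per target; fail fast otherwise.
     * pg_atomic_test_set_flag() returns false if the flag was already set. */
    if (!pg_atomic_test_set_flag(&memCtxSlot->request_pending))
        ereport(ERROR,
                (errmsg("a memory context statistics request is already pending for PID %d",
                        (int) targetPid)));

    /* ... send the signal and wait for the statistics ... */

    pg_atomic_clear_flag(&memCtxSlot->request_pending);

The flag would also need to be cleared on error paths (e.g., via
PG_TRY/PG_FINALLY or a before_shmem_exit callback) so that an aborted
request doesn't block all future ones.
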
Regards,
--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION