From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Remove wal_[sync|write][_time] from pg_stat_wal and track_wal_io_timing |
Date: | 2025-02-24 13:08:03 |
Message-ID: | Z7xvM8rxe9dvpc8Z@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Mon, Feb 24, 2025 at 06:15:53AM -0500, Andres Freund wrote:
> Hi,
>
> On 2025-02-24 10:54:54 +0000, Bertrand Drouvot wrote:
> > a051e71e28a added the "timing tracking" in the WAL code path with "only"
> > track_io_timing enabled (and track_wal_io_timing still false). Here, in this thread,
> > we remove unnecessary INSTR_TIME_SET_CURRENT()/INSTR_TIME_ACCUM_DIFF() calls when
> > both track_io_timing and track_wal_io_timing were enabled.
> >
> > So I think that your main concern comes from the fact that a051e71e28a added the
> > "timing tracking" in the WAL code path with "only" track_io_timing enabled.
>
> Correct.
>
>
> > That's a concern we also had and discussed in [1]. The question was: should
> > we track the WAL timing stats in pg_stat_io only when track_wal_io_timing is
> > enabled or relying only on track_io_timing would be enough?
> >
> > We ran several benchmarks ([2], [3] and [4]) and based on the results we reached
> > the conclusion that to rely only on track_io_timing to display the WAL timing
> > stats in pg_stat_io was "good" enough.
>
> Sorry to be blunt, but those benchmarks miss the mark. The benchmark
>
> - emits WAL in a transactional fashion, with fairly small transactions, where
> the disk is fairly slow. It's absolutely guaranteed that the bottleneck is
> just the WAL flush disk time. You're doing ~1k-2k TPS - the timing overhead
> would have to be *ginormous* to show up.
>
> - emits WAL at a low concurrency thus lock contention not really an issue.
>
> - likely is executed so on a system with very cheap timing functions - but
> often production workloads unfortunately don't. While it's not quite as
> common as it used to be, virtualized systems often have considerably slower
> clocks.
>
>
> On this system it'll likely be fine overhead-wise:
>
> $ echo tsc | sudo tee /sys/devices/system/clocksource/clocksource0/current_clocksource
> tsc
>
> $ src/bin/pg_test_timing/pg_test_timing
> Testing timing overhead for 3 seconds.
> Per loop time including overhead: 19.17 ns
> Histogram of timing durations:
> < us % of total count
> 1 98.08521 153484414
> 2 1.91421 2995376
> 4 0.00040 619
> 8 0.00014 225
> 16 0.00002 35
> 32 0.00001 9
> 64 0.00001 10
> 128 0.00000 3
> 256 0.00000 2
> 512 0.00000 1
>
>
> On this system however, I'd not want to bet on it:
>
> $ echo hpet | sudo tee /sys/devices/system/clocksource/clocksource0/current_clocksource
> $ src/bin/pg_test_timing/pg_test_timing
> Testing timing overhead for 3 seconds.
> Per loop time including overhead: 1228.91 ns
> Histogram of timing durations:
> < us % of total count
> 1 0.15574 3802
> 2 78.51065 1916598
> 4 21.26778 519188
> 8 0.02069 505
> 16 0.03957 966
> 32 0.00356 87
> 64 0.00193 47
> 128 0.00008 2
>
>
I agree that we've to put everything in the picture (system with or without
cheap timing functions, lock contention and WAL flush disk time) and that we
can certainly find configuration/workload that would get benefits from a
dedicated track_wal_io_timing GUC.
PFA a patch to re-introduce the track_wal_io_timing GUC and to ensure that the
WAL write and fsync timing activities are tracked in pg_stat_io (and
pg_stat_get_backend_io()) only if both track_io_timing and track_wal_io_timing
are enabled.
Note that to change the a051e71e28a behavior, the attached patch adds an extra
"track_io_guc" parameter to pgstat_count_io_op_time() (like pgstat_prepare_io_time
already had in a051e71e28a).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
v1-0001-re-introduce-track_wal_io_timing.patch | text/x-diff | 18.7 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | wenhui qiu | 2025-02-24 13:08:06 | Re: Anti join confusion |
Previous Message | Rahila Syed | 2025-02-24 12:46:45 | Re: Enhancing Memory Context Statistics Reporting |