Re: Commit Latency

From: ravi k <ravisql09(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ramakrishna m <ram(dot)pgdb(at)gmail(dot)com>, pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Commit Latency
Date: 2025-02-08 08:16:44
Message-ID: CAFL4M8EBX4u0nTKAe4r2LHyyuGKELALcOYs3fPQ6tQdeSnw-Rg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Please find the detail below:

Our system is of 4TB(SSD shared pure storage) and *xfs *filesystem:

- Also handling 300 TPS (~3500 DML queries)
- 80% of modified data is BLOB (text)
- Table sizes generally under 70GB
- Unusual naptime configuration due to data characteristics
- Idle in TX expected regularly ~1ms
- Action items are in pipeline:
a. Adjusting vacuum parameters
b. Planning manual vacuum jobs during off-hours
- Note: App team is working on refactoring, but this will take time

Pg_stat_activity o/p:
state | count
---------------------+-------
| 5
active | 9
idle in transaction | 2
idle | 212
(4 rows)

Laurenz, thanks for the article.

Alvaro, regarding pg_wait_sampling, we are unable to enable it at this time
due to the requirement for a server restart.

Tomas, please find the fsync log attached below for your review.Let me know
if you need any further information or clarification.

5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 3530.460 ops/sec 283 usecs/op
fdatasync 3477.850 ops/sec 288 usecs/op
fsync 2990.668 ops/sec 334 usecs/op
fsync_writethrough n/a
open_sync 3287.932 ops/sec 304 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
open_datasync 1752.086 ops/sec 571 usecs/op
fdatasync 3198.132 ops/sec 313 usecs/op
fsync 2980.334 ops/sec 336 usecs/op
fsync_writethrough n/a
open_sync 1555.356 ops/sec 643 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
1 * 16kB open_sync write 2898.926 ops/sec 345 usecs/op
2 * 8kB open_sync writes 1631.841 ops/sec 613 usecs/op
4 * 4kB open_sync writes 861.491 ops/sec 1161 usecs/op
8 * 2kB open_sync writes 451.548 ops/sec 2215 usecs/op
16 * 1kB open_sync writes 233.062 ops/sec 4291 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
write, fsync, close 2747.990 ops/sec 364 usecs/op
write, close, fsync 2804.104 ops/sec 357 usecs/op

Non-sync'ed 8kB writes:
write 386932.826 ops/sec 3 usecs/op

*We suspect issues with wal_init_zero due to a 3-minute log commit spike
coinciding with increased WAL creation time, as shown in the attached
graphs.*

*Note: Size of the wal file is 1GB and generates wal of 3GB/min*
active | IO: *WALInitWrite *| 00:00:00.249311 | COMMIT
IO | *WALInitSync *| active 474.163 ms| COMMIT

Commit spike timings:

[image: image.png]

Wal creation_timings:

[image: image.png]

Best,
Ravi

On Fri, Feb 7, 2025 at 10:14 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:

> On 2/7/25 14:00, ravi k wrote:
> > Thanks for the suggestions!
> >
> > It looks the issue is happening at the time of wal creation, does
> > wal_init_zero off is good option?
> >
>
> It's impossible to say if that option will help, considering how little
> info about the system you provided. I suggest you start by answering the
> questions you've been asked by me & Alvaro, explain why you concluded
> it's related to WAL creation, etc.
>
>
> regards
>
> --
> Tomas Vondra
>
>

In response to

Browse pgsql-general by date

  From Date Subject
Previous Message Robert Leach 2025-02-07 22:02:03 Re: How to perform a long running dry run transaction without blocking