Re: WALWriteLocks

From: Don Seiler <don(at)seiler(dot)us>
To: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: WALWriteLocks
Date: 2021-05-24 15:05:30
Message-ID: CAHJZqBAsRBh+cytzVeD-xGvzutJEX9ze0SN3c7w-OUVyRqYL1Q@mail.gmail.com
Lists: pgsql-admin

Not resolved. The WAL was already on SSD; we then moved it to its own separate
SSD (Azure Ultra SSD), and that seemed to lessen the severity/frequency, but we
still see occasional spikes over the course of any given day.

What is actually interesting, though, is that now that I have iostat running,
when we see the WALWriteLock spikes, the w_await and aqu-sz spike on the data
directory disk, not the WAL disk. See below for an example (the first two
readings are baseline, then the spike). sdc is the PG data directory disk
(Azure Premium SSD), and dm-0 is an LVM LUN wrapped around it. sdd is the WAL
disk (Azure Ultra SSD). The data disk is set to ReadOnly host caching per
Azure's recommendations; the WAL disk is set to None (per the recommendations
for log disks, and Ultra SSD doesn't allow any caching anyway).
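
For reference, the iostat capture below is something along the lines of
"iostat -xmt 5" (extended device stats, MB/s, with timestamps). The loop
underneath is just a sketch of how the WALWrite waits can be counted on the
same interval so they line up with the iostat timestamps -- the wait_event
name is the one from the PG 13 docs; the rest is illustrative, not our exact
script:

    # sample extended device stats in MB/s every 5 seconds, with timestamps
    iostat -xmt 5

    # in parallel, count backends currently waiting on the WALWrite LWLock
    while true; do
        psql -XAtc "SELECT now(), count(*)
                    FROM pg_stat_activity
                    WHERE wait_event_type = 'LWLock'
                      AND wait_event = 'WALWrite';"
        sleep 5
    done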

05/12/2021 01:45:47 PM
avg-cpu: %user %nice %system %iowait %steal %idle
17.77 0.00 3.61 1.67 0.00 76.95

Device  r/s  w/s  rMB/s  wMB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
loop0  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sda  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sdb  0.00  3.20  0.00  0.07  0.00  14.60  0.00  82.02  0.00  1.69  0.00  0.00  22.25  3.25  1.04
sdc  1180.00  182.80  11.39  1.41  0.00  0.00  0.00  0.00  0.71  5.22  0.77  9.88  7.93  0.67  91.20
sdd  0.00  1290.00  0.00  19.82  0.00  250.00  0.00  16.23  0.00  0.46  0.01  0.00  15.74  0.76  98.56
dm-0  1180.40  182.80  11.39  1.41  0.00  0.00  0.00  0.00  0.74  5.30  1.84  9.88  7.93  0.67  91.28

05/12/2021 01:45:52 PM
avg-cpu: %user %nice %system %iowait %steal %idle
18.71 0.00 4.02 1.54 0.00 75.74

Device  r/s  w/s  rMB/s  wMB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
loop0  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sda  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sdb  0.00  3.40  0.00  0.07  0.00  15.00  0.00  81.52  0.00  1.88  0.00  0.00  21.65  3.53  1.20
sdc  1035.20  118.60  10.55  1.05  0.00  26.80  0.00  18.43  0.67  5.27  0.50  10.44  9.07  0.81  93.44
sdd  0.00  1298.00  0.00  21.64  0.00  264.40  0.00  16.92  0.00  0.49  0.01  0.00  17.08  0.76  98.96
dm-0  1035.20  145.40  10.55  1.05  0.00  0.00  0.00  0.00  0.69  5.49  1.52  10.44  7.39  0.79  93.44

05/12/2021 01:45:57 PM
avg-cpu: %user %nice %system %iowait %steal %idle
17.85 0.00 4.34 10.93 0.00 66.88

Device  r/s  w/s  rMB/s  wMB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
loop0  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sda  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sdb  0.00  3.80  0.00  0.07  0.00  14.60  0.00  79.35  0.00  2.26  0.00  0.00  20.00  3.58  1.36
sdc  983.20  2831.80  8.98  36.91  0.00  0.00  0.00  0.00  6.61  38.56  108.97  9.35  13.35  0.25  94.80
sdd  0.00  1246.20  0.00  21.24  0.00  249.00  0.00  16.65  0.00  0.49  0.03  0.00  17.45  0.79  98.48
dm-0  982.80  2831.60  8.98  36.91  0.00  0.00  0.00  0.00  6.58  38.66  115.93  9.35  13.35  0.25  94.88

05/12/2021 01:46:02 PM
avg-cpu: %user %nice %system %iowait %steal %idle
19.40 0.00 4.48 2.53 0.00 73.59

Device  r/s  w/s  rMB/s  wMB/s  rrqm/s  wrqm/s  %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
loop0  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sda  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00  0.00
sdb  2.20  4.00  0.04  0.09  0.00  20.00  0.00  83.33  0.36  2.20  0.00  17.45  24.00  2.32  1.44
sdc  915.00  2592.60  8.31  23.10  0.00  44.20  0.00  1.68  1.20  16.35  37.69  9.30  9.13  0.27  94.64
sdd  0.00  1234.20  0.00  24.99  0.00  253.60  0.00  17.05  0.00  0.54  0.02  0.00  20.74  0.80  99.20
dm-0  915.60  2636.80  8.31  23.10  0.00  0.00  0.00  0.00  1.20  16.19  43.80  9.30  8.97  0.27  94.64

On Mon, May 24, 2021 at 9:13 AM Vijaykumar Jain <
vijaykumarjain(dot)github(at)gmail(dot)com> wrote:

> Hey Don,
>
> Was this resolved?
> Were there any changes after moving WAL to SSD?
>
> On Wed, May 5, 2021, 10:14 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
> wrote:
>
>> On Wed, 2021-05-05 at 11:25 -0500, Don Seiler wrote:
>> > What I want to know is if there are other events that would contribute
>> > to a WALWriteLock (eg writing to a data file or the postgresql.log
>> file)
>> > or is it 100% writing to the WAL file.
>>
>> You don't contribute to a lock...
>>
>> The documentation says
>> (
>> https://www.postgresql.org/docs/13/monitoring-stats.html#WAIT-EVENT-LOCK-TABLE
>> ):
>>
>> > WALWrite Waiting for WAL buffers to be written to disk.
>>
>> So it is contention while writing or flushing WAL.
>> You can read the code for the details.
>>
>> Yours,
>> Laurenz Albe
>> --
>> Cybertec | https://www.cybertec-postgresql.com
>>
>>

--
Don Seiler
www.seiler.us
