Re: WALWriteLocks

From: Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To: Don Seiler <don(at)seiler(dot)us>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: WALWriteLocks
Date: 2021-05-24 15:38:13
Message-ID: CAM+6J974Opbd5JWVqaSVKesp=qqbu7+7Y=fgNJov5sKy8oczEA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Thanks.

For my ref:
Virtual machine and disk performance - Azure Virtual Machines | Microsoft
Docs
<https://docs.microsoft.com/en-us/azure/virtual-machines/disks-performance>

Sorry, i just want to use all the info, i have less exp :)
do you generate large wals ?
if your checkpoint_completion_target is aggressive, it will result in a lot
of io.
if the checkpoint is spread over a larger time window, maybe it will reduce
frequent i/o pressure.
but for that log_checkpoints needs to be turned on if they are hogging the
iops.

Basics of Tuning Checkpoints - 2ndQuadrant | PostgreSQL
<https://www.2ndquadrant.com/en/blog/basics-of-tuning-checkpoints/>
Tuning Input/Output (I/O) Operations for PostgreSQL | Severalnines
<https://severalnines.com/blog/tuning-io-operations-postgresql>

I did use the above to spread checkpoints to reduce io stress once, but it
meant recovery in case of crash would take longer.

But ok, I guess I am diverging. I'll see if I can simulate the exact
messages.

On Mon, 24 May 2021 at 20:35, Don Seiler <don(at)seiler(dot)us> wrote:

> Not resolved. The WAL was already on SSD, we further moved it to a
> separate SSD (Azure's Ultra SSD) and that seemed to lessen the
> severity/frequency but we still see occasional spikes over the course of
> any given day.
>
> What is actually interesting though is now that I've had iostat running,
> when we see the WALWriteLock spikes, we see the w_await and aqu-sz spike on
> the data directory disk, not the WAL disk. See below for an example (first
> two readings for baseline, then spike). sdc is the PG data directory disk
> (Azure Premium SSD), and dm-0 is an LVM lun wrapped around it. sdd is the
> WAL disk (Azure Ultra SSD). The data disk is set to ReadOnly host caching
> per Azure's recommendations, the WAL disk is set to None (per
> recommendations for log disk plus Ultra SSD doesn't allow any caching
> anyway).
>
> 05/12/2021 01:45:47 PM
> avg-cpu: %user %nice %system %iowait %steal %idle
> 17.77 0.00 3.61 1.67 0.00 76.95
>
> Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm
> %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
> loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sdb 0.00 3.20 0.00 0.07 0.00 14.60 0.00
> 82.02 0.00 1.69 0.00 0.00 22.25 3.25 1.04
> sdc 1180.00 182.80 11.39 1.41 0.00 0.00 0.00
> 0.00 0.71 5.22 0.77 9.88 7.93 0.67 91.20
> sdd 0.00 1290.00 0.00 19.82 0.00 250.00 0.00
> 16.23 0.00 0.46 0.01 0.00 15.74 0.76 98.56
> dm-0 1180.40 182.80 11.39 1.41 0.00 0.00 0.00
> 0.00 0.74 5.30 1.84 9.88 7.93 0.67 91.28
>
> 05/12/2021 01:45:52 PM
> avg-cpu: %user %nice %system %iowait %steal %idle
> 18.71 0.00 4.02 1.54 0.00 75.74
>
> Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm
> %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
> loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sdb 0.00 3.40 0.00 0.07 0.00 15.00 0.00
> 81.52 0.00 1.88 0.00 0.00 21.65 3.53 1.20
> sdc 1035.20 118.60 10.55 1.05 0.00 26.80 0.00
> 18.43 0.67 5.27 0.50 10.44 9.07 0.81 93.44
> sdd 0.00 1298.00 0.00 21.64 0.00 264.40 0.00
> 16.92 0.00 0.49 0.01 0.00 17.08 0.76 98.96
> dm-0 1035.20 145.40 10.55 1.05 0.00 0.00 0.00
> 0.00 0.69 5.49 1.52 10.44 7.39 0.79 93.44
>
> 05/12/2021 01:45:57 PM
> avg-cpu: %user %nice %system %iowait %steal %idle
> 17.85 0.00 4.34 10.93 0.00 66.88
>
> Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm
> %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
> loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sdb 0.00 3.80 0.00 0.07 0.00 14.60 0.00
> 79.35 0.00 2.26 0.00 0.00 20.00 3.58 1.36
> sdc 983.20 2831.80 8.98 36.91 0.00 0.00 0.00
> 0.00 6.61 38.56 108.97 9.35 13.35 0.25 94.80
> sdd 0.00 1246.20 0.00 21.24 0.00 249.00 0.00
> 16.65 0.00 0.49 0.03 0.00 17.45 0.79 98.48
> dm-0 982.80 2831.60 8.98 36.91 0.00 0.00 0.00
> 0.00 6.58 38.66 115.93 9.35 13.35 0.25 94.88
>
> 05/12/2021 01:46:02 PM
> avg-cpu: %user %nice %system %iowait %steal %idle
> 19.40 0.00 4.48 2.53 0.00 73.59
>
> Device r/s w/s rMB/s wMB/s rrqm/s wrqm/s %rrqm
> %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm %util
> loop0 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sda 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
> sdb 2.20 4.00 0.04 0.09 0.00 20.00 0.00
> 83.33 0.36 2.20 0.00 17.45 24.00 2.32 1.44
> sdc 915.00 2592.60 8.31 23.10 0.00 44.20 0.00
> 1.68 1.20 16.35 37.69 9.30 9.13 0.27 94.64
> sdd 0.00 1234.20 0.00 24.99 0.00 253.60 0.00
> 17.05 0.00 0.54 0.02 0.00 20.74 0.80 99.20
> dm-0 915.60 2636.80 8.31 23.10 0.00 0.00 0.00
> 0.00 1.20 16.19 43.80 9.30 8.97 0.27 94.64
>
> On Mon, May 24, 2021 at 9:13 AM Vijaykumar Jain <
> vijaykumarjain(dot)github(at)gmail(dot)com> wrote:
>
>> Hey Don,
>>
>> Was this resolved?
>> We're there any changes after moving wal to ssd ?
>>
>> On Wed, May 5, 2021, 10:14 PM Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
>> wrote:
>>
>>> On Wed, 2021-05-05 at 11:25 -0500, Don Seiler wrote:
>>> > What I want to know is if there are other events that would contribute
>>> > to a WALWriteLock (eg writing to a data file or the postgresql.log
>>> file)
>>> > or is it 100% writing to the WAL file.
>>>
>>> You don't contribute to a lock...
>>>
>>> The documentation says
>>> (
>>> https://www.postgresql.org/docs/13/monitoring-stats.html#WAIT-EVENT-LOCK-TABLE
>>> ):
>>>
>>> > WALWrite Waiting for WAL buffers to be written to disk.
>>>
>>> So it is contention while writing or flushing WAL.
>>> You can read the code for the details.
>>>
>>> Yours,
>>> Laurenz Albe
>>> --
>>> Cybertec | https://www.cybertec-postgresql.com
>>>
>>>
>
> --
> Don Seiler
> www.seiler.us
>

--
Thanks,
Vijay
Mumbai, India

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Don Seiler 2021-05-24 15:51:38 Re: WALWriteLocks
Previous Message Don Seiler 2021-05-24 15:05:30 Re: WALWriteLocks