Re: asynchronous commit risk window is overly optimistic

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: pgsql-docs(at)postgresql(dot)org
Subject: Re: asynchronous commit risk window is overly optimistic
Date: 2019-04-09 21:44:12
Message-ID: 20190409214412.pew7tgfzn2y7k334@momjian.us
Lists: pgsql-docs

On Wed, Mar 20, 2019 at 02:50:21PM -0400, Jeff Janes wrote:
> https://www.postgresql.org/docs/current/wal-async-commit.html:
>
> "If the database crashes during the risk window between an asynchronous commit
> and the writing of the transaction's WAL records, then changes made during that
> transaction will be lost. The duration of the risk window is limited because a
> background process (the “WAL writer”) flushes unwritten WAL records to disk
> every wal_writer_delay milliseconds. The actual maximum duration of the risk
> window is three times wal_writer_delay because the WAL writer is designed to
> favor writing whole pages at a time during busy periods."
>
> I think the phrase "actual maximum duration" here is far too reassuring. There
> is no guarantee that the kernel will wake the WAL writer three times in a row
> at the times it requested, or even within any other smallish multiple of that
> time. Even if the WAL writer does repeatedly wake on schedule and request an
> fsync, that fsync itself can take a very large multiple of wal_writer_delay
> milliseconds before it takes effect.
>
> If your server experiences a sudden power failure during normal operations with
> uncongested IO, then it is very likely that anything asynchronously committed
> more than three times wal_writer_delay (plus two disk rotations) ago has made it to
> disk.  But if it crashes for some other reason than a sudden power failure, it
> is less likely to be on disk.  A stricken server can go wobbly for a long time
> before finally falling over.
>
> Maybe it should be replaced with something less confident, like "Under normal
> conditions, the flush will be initiated within three times wal_writer_delay
> because the WAL writer is designed to favor writing whole pages at a time
> during busy periods."
>
> Although the whole "because" clause seems to be more inside baseball than is
> warranted here.

I think we can go with:

"Under normal conditions, the flush will be initiated within
roughly three times wal_writer_delay".
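
To put a number on that wording, here is a minimal sketch of the arithmetic.
It assumes the stock default of wal_writer_delay = 200 ms; the function name
and its parameters are hypothetical, made up for illustration. The result is
only the nominal figure for when the flush is initiated under normal
conditions, not a bound on when the data is safely on disk.

```python
def nominal_flush_window_ms(wal_writer_delay_ms: int = 200, multiplier: int = 3) -> int:
    """Nominal time (ms) within which the WAL writer initiates a flush.

    Assumption: 200 ms is the default wal_writer_delay; pass your own
    setting otherwise. This is NOT a guarantee: scheduling delays and a
    slow fsync can stretch the real risk window well beyond this value.
    """
    if wal_writer_delay_ms <= 0:
        raise ValueError("wal_writer_delay_ms must be positive")
    return multiplier * wal_writer_delay_ms


if __name__ == "__main__":
    # With the assumed default, the nominal window is 3 * 200 ms.
    print(nominal_flush_window_ms())
```

With the assumed default this gives roughly 600 ms, which is the "roughly
three times wal_writer_delay" figure and nothing stronger.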

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
