Re: Request for further clarification on synchronous_commit

From: Kasper Kondzielski <kghost0(at)gmail(dot)com>
To: Bruce Momjian <bruce(at)momjian(dot)us>
Cc: pgsql-docs(at)lists(dot)postgresql(dot)org, jaroslaw(dot)kijanowski(at)softwaremill(dot)com
Subject: Re: Request for further clarification on synchronous_commit
Date: 2020-08-19 09:39:53
Message-ID: CAFv2VPQRT=8d2Q3ipTXZTaOdEY+taR8gBE77kL6dkk8gE09Xnw@mail.gmail.com
Lists: pgsql-docs

> On Tue, Aug 18, 2020 at 12:50:34PM +0200, Kasper Kondzielski wrote:
> > Hi, thanks for the reply.
> >
> > To be honest I don't think it is better. Previously paragraph about
> > remote_apply was after paragraph about `on` and before remote_write which
> > followed natural order in terms of how strict these parameters are (i.e. how
> > strong are the guarantees they provide). Because of that I think that
> > remote_apply should return to its previous position.

> Uh, not really --- see below.

Ok, I see, thanks. Shouldn't we then stick to this order whenever possible
(though it might sometimes be reversed)?
So, in the proposed patch I would suggest putting remote_apply first. (Of
course, before that we can mention that the default option is `on`, but
without going too much into the details.)

> Uh, 'on' does _not_ apply the WAL data, and remote_apply does do remote
> fsync. If you want to go in order of severity, with the most severe
> first, it is:
>
> remote_apply
> on
> remote_write
> local

Wouldn't a table be beneficial when it comes to highlighting these
differences?

+-----------------------------+---------------------------------------------------------+
|                             | synchronous_commit                                      |
+-----------------------------+--------------+-------------------+--------------+-------+
| operation on standby server | remote_apply | on (remote_flush) | remote_write | local |
+-----------------------------+--------------+-------------------+--------------+-------+
| write to WAL                | Yes          | Yes               | Yes          | No    |
+-----------------------------+--------------+-------------------+--------------+-------+
| fsync                       | Yes          | Yes               | No           | No    |
+-----------------------------+--------------+-------------------+--------------+-------+
| apply WAL data              | Yes          | No                | No           | No    |
+-----------------------------+--------------+-------------------+--------------+-------+

> and this defines the 'on' behavior:
>
> /* Define the default setting for synchronous_commit */
> #define SYNCHRONOUS_COMMIT_ON SYNCHRONOUS_COMMIT_REMOTE_FLUSH

Is there any valid reason to hide this behavior under the `on` alias? In my
opinion `remote_flush` does a much better job of describing what it does.
Maybe we could rename `on` to `remote_flush` but also create an alias
`on=remote_flush` to keep backward compatibility?

+ Finally, when set to <literal>remote_apply</literal>, commits
+ will wait until replies from the current synchronous standby(s)
+ indicate they have received the commit record of the transaction
+ and applied it, so that it has become visible to queries on the
+ standby(s), and also written to durable storage on the standbys.

"and also written to durable storage on the standbys." -> You mean flushed?
Maybe it would be better to stick to consistent terminology so as not to
introduce any confusion.

> Well, there is a doc section that talks about WAL:
>
> https://www.postgresql.org/docs/12/wal.html
>
> and other parts of the config docs that talk about WAL.

Yes, I know what WAL is for. I just don't get what kind of operation you
mean by 'WAL replay'. The only thing I can think of is the process of
restoring the database after a crash, when we apply changes from the WAL
to the data pages which haven't been flushed to disk, but I don't think
that is what you mean here. Basically, what I wonder is how a WAL replay can
influence the transaction commit.

On Tue, 18 Aug 2020 at 19:17, Bruce Momjian <bruce(at)momjian(dot)us> wrote:

> On Tue, Aug 18, 2020 at 10:58:51AM -0400, Bruce Momjian wrote:
> > Uh, 'on' does _not_ apply the WAL data, and remote_apply does do remote
> > fsync. If you want to go in order of severity, with the most severe
> > first, it is:
> >
> > remote_apply
> > on
> > remote_write
> > local
> >
> > This is seen in the C enum ordering for synchronous_commit, but in
> > reverse order:
> >
> > typedef enum
> > {
> >     SYNCHRONOUS_COMMIT_OFF,          /* asynchronous commit */
> >     SYNCHRONOUS_COMMIT_LOCAL_FLUSH,  /* wait for local flush only */
> >     SYNCHRONOUS_COMMIT_REMOTE_WRITE, /* wait for local flush and remote
> >                                       * write */
> >     SYNCHRONOUS_COMMIT_REMOTE_FLUSH, /* wait for local and remote flush */
> >     SYNCHRONOUS_COMMIT_REMOTE_APPLY  /* wait for local flush and remote apply */
> > } SyncCommitLevel;
>
> Also, there is some logic to say that the postgresql.conf
> synchronous_commit options list should be reordered from:
>
> #synchronous_commit = on # synchronization level;
> # off, local, remote_write, remote_apply, or on
>
> to
>
> #synchronous_commit = on # synchronization level;
> # off, local, remote_write, on, or remote_apply
>
> I think we should backpatch the doc changes, but maybe not the
> postgresql.conf one --- I am not sure.
>
> --
> Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
> EnterpriseDB https://enterprisedb.com
>
> The usefulness of a cup is in its emptiness, Bruce Lee
>
>
