From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Kasper Kondzielski <kghost0(at)gmail(dot)com> |
Cc: | pgsql-docs(at)lists(dot)postgresql(dot)org, jaroslaw(dot)kijanowski(at)softwaremill(dot)com |
Subject: | Re: Request for further clarification on synchronous_commit |
Date: | 2020-08-21 20:15:39 |
Message-ID: | 20200821201539.GA13363@momjian.us |
Lists: | pgsql-docs |
On Wed, Aug 19, 2020 at 11:39:53AM +0200, Kasper Kondzielski wrote:
> > On Tue, Aug 18, 2020 at 12:50:34PM +0200, Kasper Kondzielski wrote:
> > > Hi, thanks for the reply.
> > >
> > > To be honest I don't think it is better. Previously paragraph about
> > > remote_apply was after paragraph about `on` and before remote_write which
> > > followed natural order in terms of how strict these parameters are (i.e. how
> > > strong are the guarantees they provide). Because of that I think that
> > > remote_apply should return to its previous position.
>
> > Uh, not really --- see below.
>
> Ok, I see, thanks. Shouldn't we then stick to this order whenever possible
> (might be sometimes reversed).
> So, in the proposed patch I would suggest putting remote_apply first. (Of
> course, before that we can mention that the default option is `on`, but without
> going too much into the details.)
Well, it is kind of confusing. I wanted to put remote_apply in its own
paragraph because it only applies to standbys, and because it is much
heavier and in a different scope (replay) than the others. Frankly,
remote_apply is related to synchronicity only to the extent it allows
consistent/synchronous results from standbys, not related to syncing
data to the kernel or durable storage.
>
> > Uh, 'on' does _not_ apply the WAL data, and remote_apply does do remote
> > fsync. If you want to go in order of severity, with the most severe
> > first, it is:
> >
> > remote_apply
> > on
> > remote_write
> > local
>
> Wouldn't the table be beneficial when it comes to highlighting these
> differences?
Uh, I don't think we list a table like this anywhere else for config
options. I would be interested if others think it would be helpful.
> +-----------------------------+---------------------------------------------------------+
> | | synchronous_commit |
> +-----------------------------+--------------+-------------------+--------------+-------+
> | operation on standby server | remote_apply | on (remote_flush) | remote_write | local |
> +-----------------------------+--------------+-------------------+--------------+-------+
> | write to WAL | Yes | Yes | Yes | No |
> +-----------------------------+--------------+-------------------+--------------+-------+
> | fsync | Yes | Yes | No | No |
> +-----------------------------+--------------+-------------------+--------------+-------+
> | apply WAL data | Yes | No | No | No |
> +-----------------------------+--------------+-------------------+--------------+-------+
>
>
> > and this defines the 'on' behavior:
> >
> > /* Define the default setting for synchronous_commit */
> > #define SYNCHRONOUS_COMMIT_ON SYNCHRONOUS_COMMIT_REMOTE_FLUSH
>
> Is there any valid reason to hide this behavior under `on` alias? In my opinion
> `remote_flush` does much better job with describing what it does. Maybe we
> could rename `on` to `remote_flush` but also create an alias `on=remote_flush`
> to keep backward compatibility?
Well, I think we originally only had 'on', and later added the others.
Also, 'on' also means local flush. We don't support local _write_ where we
only write it to the kernel. We support fsync off, which I think is
the local behavior of remote_write. I think remote_write is saying we
want local fsync but no fsync for remote. Is that even correct?
This is certainly confusing. Maybe we do need a chart, but we need to
list local and standby behavior.
> + Finally, when set to <literal>remote_apply</literal>, commits
> + will wait until replies from the current synchronous standby(s)
> + indicate they have received the commit record of the transaction
> + and applied it, so that it has become visible to queries on the
> + standby(s), and also written to durable storage on the standbys.
>
> "and also written to durable storage on the standbys." -> You mean flushed?
> Maybe it should be better to stick to cohesive terminology to not introduce any
> confusion.
Yes, I mean written to durable storage. I don't think you can use
"flushed" alone since you could be flushing the WAL buffers to the file
system.
> > Well, there is a doc section that talks about WAL:
> >
> > https://www.postgresql.org/docs/12/wal.html
> >
> > and other parts of the config docs that talk about WAL.
>
> Yes, I know what is WAL for. I only don't get what kind of operation do you
> mean by 'WAL replay'. The only one thing which I can think of is the process of
> restoring database after a crash, when we apply changes from WAL to the data
> pages which haven't been flushed to the disk, but I don't think that this is
> that. Basically what I wonder is how can a WAL replay influence the transaction
> commit?
Well, WAL replay is how replication works. Pretty much the same thing
that happens during crash recovery, but it happens continually.
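To make the replay point concrete, here is a rough sketch in Python, not PostgreSQL code. It assumes a toy standby that tracks three positions in the WAL stream, named after the write_lsn, flush_lsn and replay_lsn columns of pg_stat_replication. The class, the function, and the plain-integer LSNs are made up for illustration only.

```python
from dataclasses import dataclass


@dataclass
class ToyStandby:
    """Hypothetical standby: three positions in the WAL stream (plain ints here)."""
    write_lsn: int = 0   # WAL handed to the standby's operating system
    flush_lsn: int = 0   # WAL fsynced to durable storage on the standby
    replay_lsn: int = 0  # WAL replayed, so changes are visible to standby queries


# Which standby position a commit on the primary waits for, per setting.
# 'local' and 'off' do not wait for the standby at all.
WAITS_FOR = {
    "remote_apply": "replay_lsn",
    "on": "flush_lsn",          # 'on' is remote flush
    "remote_write": "write_lsn",
    "local": None,
    "off": None,
}


def standby_confirms(setting, standby, commit_lsn):
    """Return True once the standby has progressed far enough for this setting."""
    field = WAITS_FOR[setting]
    if field is None:
        return True  # nothing to wait for on the standby side
    return getattr(standby, field) >= commit_lsn
```

With a standby that has flushed up to position 100 but replayed only up to 90, a commit at position 100 would be confirmed under 'on' but would still be waiting under remote_apply, because replay has not caught up yet.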
Someone just wrote this blog entry, which I think helps explain what we
are talking about:
How is this for a table?
                 -- local --   ------------------- standbys -------------------
                 durable       query         durable commit    durable commit
                 commit        consistency   after OS crash    after PG crash
remote_apply     X             X             X                 X
on               X                           X                 X
remote_write     X                                             X
local            X
off
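
The table above can also be written down as data, which makes it easy to ask "what is the weakest setting that gives me guarantee X?". This is only a sketch in Python: the guarantee names are invented shorthand for the column headings, and the placement of each X follows one reading of the table (remote_write survives a PG crash on the standby but not an OS crash; only remote_apply gives query consistency).

```python
# Hypothetical shorthand names for the four table columns.
LOCAL_DURABLE = "local_durable"
QUERY_CONSISTENCY = "standby_query_consistency"
DURABLE_OS_CRASH = "standby_durable_os_crash"
DURABLE_PG_CRASH = "standby_durable_pg_crash"

# One entry per synchronous_commit setting, mirroring the rows of the table.
GUARANTEES = {
    "remote_apply": {LOCAL_DURABLE, QUERY_CONSISTENCY,
                     DURABLE_OS_CRASH, DURABLE_PG_CRASH},
    "on": {LOCAL_DURABLE, DURABLE_OS_CRASH, DURABLE_PG_CRASH},
    "remote_write": {LOCAL_DURABLE, DURABLE_PG_CRASH},
    "local": {LOCAL_DURABLE},
    "off": set(),
}

# Settings ordered from weakest to strongest guarantees.
WEAKEST_FIRST = ["off", "local", "remote_write", "on", "remote_apply"]


def weakest_setting_for(required):
    """Return the weakest setting whose guarantees cover everything in `required`."""
    needed = set(required)
    for setting in WEAKEST_FIRST:
        if needed <= GUARANTEES[setting]:
            return setting
    raise ValueError("no setting provides: %s" % sorted(needed))
```

For example, asking only for a durable commit on the standby after a PG crash yields remote_write, while asking for query consistency on the standby yields remote_apply.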
--
Bruce Momjian <bruce(at)momjian(dot)us> https://momjian.us
EnterpriseDB https://enterprisedb.com
The usefulness of a cup is in its emptiness, Bruce Lee