Re: [Patch] ALTER SYSTEM READ ONLY

From: Dilip Kumar <dilipbalaut(at)gmail(dot)com>
To: Amul Sul <sulamul(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Prabhat Sahu <prabhat(dot)sahu(at)enterprisedb(dot)com>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] ALTER SYSTEM READ ONLY
Date: 2021-05-11 10:42:53
Message-ID: CAFiTN-vTDVDx0z+_c-VbwoziAXyF7rztTwZ5hZ5RBj88tcMWpQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, May 11, 2021 at 3:38 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
>
> On Tue, May 11, 2021 at 2:26 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> >
> > On Tue, May 11, 2021 at 2:16 PM Amul Sul <sulamul(at)gmail(dot)com> wrote:
> >
> > > I get why you think that, I wasn't very precise in briefing the problem.
> > >
> > > Any new backend that gets connected right after the shared memory
> > > state changes to WALPROHIBIT_STATE_GOING_READ_WRITE will be by
> > > default allowed to do the WAL writes. Such backends can perform write
> > > operation before the checkpointer does the XLogAcceptWrites().
> >
> > Okay, make sense now. But my next question is why do we allow backends
> > to write WAL in WALPROHIBIT_STATE_GOING_READ_WRITE state? why don't we
> > wait until the shared memory state is changed to
> > WALPROHIBIT_STATE_READ_WRITE?
> >
>
> Ok, good question.
>
> Now let's first try to understand the Checkpointer's work.
>
> When Checkpointer sees the wal prohibited state is an in-progress state, then
> it first emits the global barrier and waits until all backers absorb that.
> After that it set the final requested WAL prohibit state.
>
> When other backends absorb those barriers then appropriate action is taken
> (e.g. abort the read-write transaction if moving to read-only) by them. Also,
> LocalXLogInsertAllowed flags get reset in it and that backend needs to call
> XLogInsertAllowed() to get the right value for it, which further decides WAL
> writes permitted or prohibited.
>
> Consider an example that the system is trying to change to read-write and for
> that wal prohibited state is set to WALPROHIBIT_STATE_GOING_READ_WRITE before
> Checkpointer starts its work. If we want to treat that system as read-only for
> the WALPROHIBIT_STATE_GOING_READ_WRITE state as well. Then we might need to
> think about the behavior of the backend that has absorbed the barrier and reset
> the LocalXLogInsertAllowed flag. That backend eventually going to call
> XLogInsertAllowed() to get the actual value for it and by seeing the current
> state as WALPROHIBIT_STATE_GOING_READ_WRITE, it will set LocalXLogInsertAllowed
> again same as it was before for the read-only state.

I might be missing something, but assume the behavior should be like this

1. If the state is getting changed from WALPROHIBIT_STATE_READ_WRITE
-> WALPROHIBIT_STATE_READ_ONLY, then as soon as the backend process
the barrier, we can immediately abort any read-write transaction(and
stop allowing WAL writing), because once we ensure that all session
has responded that now they have no read-write transaction then we can
safely change the state from WALPROHIBIT_STATE_GOING_READ_ONLY to
WALPROHIBIT_STATE_READ_ONLY.

2. OTOH, if we are changing from WALPROHIBIT_STATE_READ_ONLY ->
WALPROHIBIT_STATE_READ_WRITE, then we don't need to allow the backend
to consider the system as read-write, instead, we should wait until
the shared state is changed to WALPROHIBIT_STATE_READ_WRITE.

So your problem is that on receiving the barrier we need to call
LocalXLogInsertAllowed() from the backend, but how does that matter?
you can still make IsWALProhibited() return true.

I don't know the complete code so I might be missing something but at
least that is what I would expect from the design POV.

Other than this point, I think the state names READ_ONLY, READ_WRITE
are a bit confusing no? because actually, these states represent
whether WAL is allowed or not, but READ_ONLY, READ_WRITE seems like we
are putting the system under a Read-only state. For example, if you
are doing some write operation on an unlogged table will be allowed, I
guess because that will not generate the WAL until you commit (because
commit generates WAL) right? so practically, we are just blocking the
WAL, not the write operation.

--
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message tanghy.fnst@fujitsu.com 2021-05-11 10:48:16 RE: Remove "FROM" in "DELETE FROM" when using tab-completion
Previous Message Alexander Korotkov 2021-05-11 10:16:38 Re: PG 14 release notes, first draft