Re: [Patch] ALTER SYSTEM READ ONLY

From: amul sul <sulamul(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [Patch] ALTER SYSTEM READ ONLY
Date: 2020-06-18 11:18:51
Message-ID: CAAJ_b95UPY6K4W0_=hRN1qoqG8EDYYSQnoq1tyhJZ_HM+xETLA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jun 18, 2020 at 3:25 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Wed, Jun 17, 2020 at 8:12 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >
> > On Wed, Jun 17, 2020 at 9:02 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > Do we prohibit the checkpointer from writing dirty pages and writing a
> > > checkpoint record as well? If so, will the checkpointer process
> > > write the current dirty pages and a checkpoint record, or do we
> > > skip that as well?
> >
> > I think the definition of this feature should be that you can't write
> > WAL. So, it's OK to write dirty pages in general, for example to allow
> > for buffer replacement so we can continue to run read-only queries.
> >
>
> For buffer replacement, we often also have to perform
> XLogFlush; what do we do for that? We can't proceed without doing
> that, and erroring out from there means stopping a read-only query from
> the user's perspective.
>
The read-only state does not restrict XLogFlush().

> > But there's no reason for the checkpointer to do it: it shouldn't try
> > to checkpoint, and therefore it shouldn't write dirty pages either.
> >
>
> What is the harm in doing the checkpoint before we put the system into
> READ ONLY state? The advantage is that we can at least reduce the
> recovery time if we allow writing checkpoint record.
>
A checkpoint could take a long time, whereas the intent is to switch to the
read-only state quickly.

> >
> > > What if vacuum is on an unlogged relation? Do we allow writes via
> > > vacuum to unlogged relation?
> >
> > Interesting question. I was thinking that we should probably teach the
> > autovacuum launcher to stop launching workers while the system is in a
> > READ ONLY state, but what about existing workers? Anything that
> > generates invalidation messages, acquires an XID, or writes WAL has to
> > be blocked in a read-only state; but I'm not sure to what extent the
> > first two of those things would be a problem for vacuuming an unlogged
> > table. I think you couldn't truncate it, at least, because that
> > acquires an XID.
> >
>
> If the truncate operation errors out, then won't the system again
> trigger a new autovacuum worker for the same relation, as we update
> stats at the end? Also, in general for regular tables, if there is an
> error while it tries to write WAL, it could again trigger the autovacuum
> worker for the same relation. If this is true, then it will
> unnecessarily generate a lot of dirty pages, and I don't think it will be
> good for the system to behave that way.
>
No new autovacuum worker will be forked in the read-only state, and existing
workers will get an error if they try to write WAL after absorbing the barrier.

> > > > Another part of the patch that is quite uneasy and needs discussion is that when
> > > > shutting down in the read-only state we skip the shutdown checkpoint, and at restart
> > > > startup recovery is performed first and later the read-only state is restored to
> > > > prohibit further WAL writes, irrespective of whether the recovery checkpoint succeeds or not. The
> > > > concern here is that if this startup recovery checkpoint wasn't OK, then it will never happen,
> > > > even if the system is later put back into read-write mode.
> > >
> > > I am not able to understand this problem. What do you mean by
> > > "recovery checkpoint succeed or not", do you add a try..catch and skip
> > > any error while performing recovery checkpoint?
> >
> > What I think should happen is that the end-of-recovery checkpoint
> > should be skipped, and then if the system is put back into read-write
> > mode later we should do it then.
> >
>
> But then if we have to perform recovery again, it will start from the
> previous checkpoint. I think we have to live with it.
>
Let me explain the case: if we skip the end-of-recovery checkpoint while
starting the system in read-only mode, then later change the state to
read-write and do a few write operations and online checkpoints, will that be
fine? I have yet to explore those things.

Regards,
Amul
