RE: Disable WAL logging to speed up data loading

From: "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>
To: 'Masahiko Sawada' <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>, "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "ashutosh(dot)bapat(dot)oss(at)gmail(dot)com" <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: RE: Disable WAL logging to speed up data loading
Date: 2020-11-02 00:27:14
Message-ID: OSBPR01MB488846258B74EFA1C4280168ED100@OSBPR01MB4888.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello

On Friday, October 30, 2020 1:32 PM Masahiko Sawada wrote:
> > On 2020/10/29 19:21, Laurenz Albe wrote:
> > > On Thu, 2020-10-29 at 11:42 +0900, Fujii Masao wrote:
> > >>> But what if someone sets wal_level=none, performs some data
> > >>> modifications, sets wal_level=archive and after dome more
> > >>> processing decides to restore from a backup that was taken before
> the cluster was set to wal_level=none?
> > >>> Then they would end up with a corrupted database, right?
> > >>
> > >> I think that the guard to prevent the server from starting up from
> > >> the corrupted database in that senario is necessary.
> > >
> > > That wouldn't apply, I think, because the backup from which you
> > > start was taken with wal_level = replica, so the guard wouldn't alert.
> > >
> > > But as I said in the other thread, changing wal_level emits a WAL
> > > record, and I am sure that recovery will refuse to proceed if
> > > wal_level < replica.
> >
> > Yes. What I meant was such a safe guard needs to be implemented.
> >
> > This may mean that if we want to recover the database from that
> > backup, we need to specify the recovery target so that the archive
> > recovery stops just before the WAL record indicating wal_level change.
>
> Yeah, it also means that setting wal_level to none makes the previous backup
> no use even if the user has some generations of backup.
>
> Does it make things simple if the usage of wal_level = 'none' is limited to
> initial data loading for example? I mean we add a special flag to initdb that
> sets wal_level to 'none' after initialization and the user does initial data
> loading and set wal_level to >= minimal.
> That is, we allow users to set from none to >= minimal but not for the reverse.
> Since we prevent the database cluster from backup when wal_level is none,
> the recovery never across wal_level = none. Not sure this idea can address
> the case Osumi-san concerned though.
Thanks you so much for the discussion.

Hmm, this was a good idea to implement
some kind of valve to prevent backflow of wal_levels.
But, ideally I'd like to allow users to switch wal_levels from/to 'none',
in order to let them operate bulk load in a faster way by this feature
even if that operation isn't for the initial data loading.

Regards,
Takamichi Osumi

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2020-11-02 00:49:47 Re: Dereference before NULL check (src/backend/storage/ipc/latch.c)
Previous Message Tom Lane 2020-11-01 23:15:20 Re: how to replicate test results in cf-bot on travis