Re: [HACKERS] Unlogged tables cleanup

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, konstantin knizhnik <k(dot)knizhnik(at)postgrespro(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Unlogged tables cleanup
Date: 2019-05-13 17:07:30
Message-ID: 20190513170730.GA25530@alvherre.pgsql
Lists: pgsql-hackers

On 2019-May-13, Andres Freund wrote:

> On 2019-05-13 12:24:05 -0400, Alvaro Herrera wrote:

> > AFAICS ResetUnloggedRelations copies the init fork after replaying WAL,
> > so it would be sufficient to have the init fork be recovered from WAL
> > for that to work. However, we also do ResetUnloggedRelations *before*
> > replaying WAL in order to remove leftover not-init-fork files, and that
> > process requires that the init fork is present at that time.
>
> What scenario are you precisely wondering about? That
> ResetUnloggedRelations() could overwrite the main fork, while not yet
> having valid contents (due to the lack of smgrimmedsync())? Shouldn't
> that only be possible while still in an inconsistent state? A checkpoint
> would have serialized the correct contents, and we'd not reach HS
> consistency before having replayed the WAL records resetting the table
> and the init fork consistency?

The first ResetUnloggedRelations call occurs before any WAL is replayed,
so the data dir is certainly still in an inconsistent state. At that point,
we need the init fork files to be present, because the init files are the
indicators of what relations we need to delete the other forks for.
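
To make the dependency concrete, here is a rough sketch of that cleanup
pass. It is not the actual C code: the file name pattern is simplified and
the helper names (parse_name, files_to_remove) are made up for
illustration. It only shows that the set of files to delete is derived
from which init forks are found on disk.

```python
import re

# Simplified relation file name: <relfilenode>[_<fork>][.<segment>]
# (assumed pattern for illustration only; the real parsing is done in C).
_NAME_RE = re.compile(r"^(\d+)(?:_(fsm|vm|init))?(?:\.(\d+))?$")


def parse_name(name):
    """Return (relfilenode, fork), or None if this is not a relation file."""
    m = _NAME_RE.match(name)
    if m is None:
        return None
    return m.group(1), m.group(2) or "main"


def files_to_remove(filenames):
    """Pick the non-init fork files of every relation that has an init fork."""
    parsed = {name: parse_name(name) for name in filenames}
    # Pass 1: an init fork on disk is the only marker that a relation
    # is unlogged.
    unlogged = {p[0] for p in parsed.values() if p and p[1] == "init"}
    # Pass 2: remove every other fork of those relations, keep the init fork.
    return sorted(
        name for name, p in parsed.items()
        if p and p[0] in unlogged and p[1] != "init"
    )
```

With a directory containing 16384, 16384_fsm and 16384_init, the sketch
selects 16384 and 16384_fsm for removal. If 16384_init never made it to
disk, nothing is selected and the stale forks are left behind, which is
the failure the immedsync is guarding against.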

Maybe we can do something lighter than a full immedsync of all the data
for the init file -- it would be sufficient to have the file *exist* --
but I'm not sure this optimization is worth anything.

> > So I think the immedsync call is necessary (otherwise the cleanup
> > may fail). I don't quite understand why the log_smgrcreate is
> > necessary, but I think it is for reasons that are not adequately
> > explained by the existing comments.
>
> Well, otherwise the relation won't exist on a standby? And if replay
> starts from before a database/tablespace creation we'd remove the init
> fork. So if it's not in the WAL, we'd lose it.

Ah, of course. Well, that needs to be in the comments then.

--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
