Unlogged tables can vanish after a crash

From: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Unlogged tables can vanish after a crash
Date: 2014-11-19 11:26:56
Message-ID: A737B7A37273E048B164557ADEF4A58B17D9FC1B@ntex2010a.host.magwien.gv.at
Lists: pgsql-hackers

I observed an interesting (and I think buggy) behaviour today after one of
our clusters crashed due to an "out of space" condition in the data directory.

Five databases in that cluster each have one unlogged table.

The log reads as follows:

PANIC could not write to file "pg_xlog/xlogtemp.1820": No space left on device
...
LOG terminating any other active server processes
...
LOG all server processes terminated; reinitializing
LOG database system was interrupted; last known up at 2014-11-18 18:04:28 CET
LOG database system was not properly shut down; automatic recovery in progress
LOG redo starts at C9/50403B20
LOG redo done at C9/5AFFFF98
LOG checkpoint starting: end-of-recovery immediate
LOG checkpoint complete: ...
LOG autovacuum launcher started
LOG database system is ready to accept connections
...
PANIC could not write to file "pg_xlog/xlogtemp.4417": No space left on device
...
LOG terminating any other active server processes
...
LOG all server processes terminated; reinitializing
LOG database system was interrupted; last known up at 2014-11-18 18:04:38 CET
LOG database system was not properly shut down; automatic recovery in progress
LOG redo starts at C9/5B000070
LOG redo done at C9/5FFFE4E0
LOG checkpoint starting: end-of-recovery immediate
LOG checkpoint complete: ...
FATAL could not write to file "pg_xlog/xlogtemp.4442": No space left on device
LOG startup process (PID 4442) exited with exit code 1
LOG aborting startup due to startup process failure

After the out-of-space condition was resolved, the cluster was restarted.
The log reads as follows:

LOG ending log output to stderr
HINT Future log output will go to log destination "csvlog".
LOG database system was shut down at 2014-11-18 18:05:03 CET
LOG autovacuum launcher started
LOG database system is ready to accept connections

So no crash recovery was performed, probably because the startup process
failed *after* it completed the end-of-recovery checkpoint.

Now the main fork files for all five unlogged tables are gone; the init fork files
are still there.
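
One way to confirm this state on disk is to look for init fork files that have no matching main fork file. An init fork is stored as <relfilenode>_init next to the main fork, which is named just <relfilenode>. The following C sketch is only an illustration under that assumption: the function names are made up, it works on a list of file names taken from a single database directory rather than reading the directory itself, and it is not part of PostgreSQL.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true if name has the form "<digits>_init".
 * On success, *base_len receives the length of the leading digit run,
 * i.e. the length of the relfilenode part of the name.
 */
static bool
is_init_fork(const char *name, size_t *base_len)
{
    size_t n = strspn(name, "0123456789");

    if (n == 0 || strcmp(name + n, "_init") != 0)
        return false;
    *base_len = n;
    return true;
}

/*
 * Hypothetical helper: counts the init forks in names[0..count) for which
 * no file named exactly "<relfilenode>" (the main fork) is in the list.
 */
static size_t
count_missing_main_forks(const char *const *names, size_t count)
{
    size_t missing = 0;

    for (size_t i = 0; i < count; i++)
    {
        size_t base_len;
        bool   found = false;

        if (!is_init_fork(names[i], &base_len))
            continue;

        /* Look for a file whose whole name equals the relfilenode. */
        for (size_t j = 0; j < count && !found; j++)
            found = strlen(names[j]) == base_len &&
                    strncmp(names[j], names[i], base_len) == 0;

        if (!found)
            missing++;
    }
    return missing;
}
```

Given the names 16384, 16384_init and 16390_init (made-up relfilenodes), count_missing_main_forks() would report one missing main fork, the one for 16390.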

Obviously the main forks got nuked during recovery, but the startup process died
before it could recreate them:

    /*
     * Preallocate additional log files, if wanted.
     */
    PreallocXlogFiles(EndOfLog);

    /*
     * Reset initial contents of unlogged relations.  This has to be done
     * AFTER recovery is complete so that any unlogged relations created
     * during recovery also get picked up.
     */
    if (InRecovery)
        ResetUnloggedRelations(UNLOGGED_RELATION_INIT);

It seems to me that the right fix would be to recreate the unlogged
relations *before* the end-of-recovery checkpoint, so that a failure after
the checkpoint cannot leave them without a main fork.
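
To make the ordering argument concrete, here is a toy model in C. It is not PostgreSQL code and every name in it is invented for illustration. It assumes only what is described above: recovery removes the main fork, the end-of-recovery checkpoint means the next start skips recovery, and the init step recreates the main fork from the init fork.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state of a cluster with one unlogged table (hypothetical model). */
typedef struct
{
    bool main_fork_exists;  /* main fork file of the unlogged table */
    bool init_fork_exists;  /* init fork file, kept across crashes */
    bool needs_recovery;    /* false once a checkpoint marks a clean state */
} ToyCluster;

typedef enum
{
    TOY_INIT_AFTER_CHECKPOINT,   /* order shown in the code excerpt */
    TOY_INIT_BEFORE_CHECKPOINT   /* proposed order */
} ToyOrder;

/* Recreate the main fork from the init fork. */
static void
toy_reset_init(ToyCluster *c)
{
    assert(c->init_fork_exists);
    c->main_fork_exists = true;
}

/*
 * One startup attempt.  fail_after_checkpoint simulates the startup
 * process dying (e.g. out of disk space) right after the checkpoint.
 * Returns true if startup finished.
 */
static bool
toy_startup(ToyCluster *c, ToyOrder order, bool fail_after_checkpoint)
{
    if (!c->needs_recovery)
        return true;                /* clean state: nothing is reset */

    c->main_fork_exists = false;    /* main fork removed during recovery */

    if (order == TOY_INIT_BEFORE_CHECKPOINT)
        toy_reset_init(c);

    c->needs_recovery = false;      /* end-of-recovery checkpoint */

    if (fail_after_checkpoint)
        return false;               /* startup process exits here */

    if (order == TOY_INIT_AFTER_CHECKPOINT)
        toy_reset_init(c);

    return true;
}
```

With TOY_INIT_AFTER_CHECKPOINT, a failed attempt followed by a normal restart ends with main_fork_exists still false, which matches the state described above. With TOY_INIT_BEFORE_CHECKPOINT, the main fork is already back when the checkpoint is recorded, so the same failure does no harm.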

Yours,
Laurenz Albe
