From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Creation of an empty table is not fsync'd at checkpoint |
Date: | 2022-01-27 17:55:45 |
Message-ID: | d47d8122-415e-425c-d0a2-e0160829702d@iki.fi |
Lists: | pgsql-hackers |
If you create an empty table, its file is not fsync'd. As soon as you insert a
row into it, register_dirty_segment() gets called, and after that, the
next checkpoint will fsync it. But before that, the creation itself is
never fsync'd. That's obviously not great.
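To illustrate what "fsync'ing the creation" means at the OS level, here is a minimal POSIX sketch. It is not PostgreSQL code: the function name create_empty_file_durably() and its arguments are made up for this example. It assumes the usual POSIX rule of thumb that a newly created file is only guaranteed to survive a crash once both the file and the directory that holds its entry have been fsync'd.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the PostgreSQL source): create an empty
 * file under 'dir' and flush both the file and the directory entry.
 * Returns 0 on success, -1 on failure.
 */
int
create_empty_file_durably(const char *dir, const char *name)
{
    char path[4096];
    int  n, fd, dirfd, rc;

    assert(dir != NULL && name != NULL);

    n = snprintf(path, sizeof(path), "%s/%s", dir, name);
    if (n < 0 || (size_t) n >= sizeof(path))
        return -1;              /* path did not fit in the buffer */

    /* O_EXCL makes this fail if the file already exists. */
    fd = open(path, O_CREAT | O_EXCL | O_RDWR, 0600);
    if (fd < 0)
        return -1;

    /* Flush the (empty) file itself. */
    rc = fsync(fd);
    if (close(fd) != 0)
        rc = -1;
    if (rc != 0)
        return -1;

    /* Flush the directory, so that the new directory entry is on disk too. */
    dirfd = open(dir, O_RDONLY);
    if (dirfd < 0)
        return -1;
    rc = fsync(dirfd);
    if (close(dirfd) != 0)
        rc = -1;
    return rc;
}
```

Without those explicit flushes, whether the new file survives a crash is left to the filesystem's own write ordering.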
The lack of an fsync is a bit hard to prove because it requires a
hardware failure, or a simulation of it, and can be affected by
filesystem options too. But I was able to demonstrate a problem with
these steps:
1. Create a VM with two virtual disks. Use ext4, with 'data=writeback'
option (I'm not sure if that's required). Install PostgreSQL on one of
the virtual disks.

2. Start the server, and create a tablespace on the other disk:

    CREATE TABLESPACE foospc LOCATION '/data/heikki';

3. Do this:

    CREATE TABLE foo (i int) TABLESPACE foospc;
    CHECKPOINT;

4. Immediately after that, kill the VM. I used:

    killall -9 qemu-system-x86_64

5. Restart the VM, restart PostgreSQL. Now when you try to use the
table, you get an error:

    postgres=# select * from crashtest ;
    ERROR: could not open file "pg_tblspc/81921/PG_15_202201271/5/98304":
    No such file or directory
I was not able to reproduce this without the tablespace on a different
virtual disk, I presume because ext4 orders the writes so that the
checkpoint implicitly always flushes the creation of the file to disk. I
tried data=writeback in the single-disk setup too, but it didn't make a
difference. With a separate disk, though, it happens every time.
I think the simplest fix is to call register_dirty_segment() from
mdcreate(), as in the attached patch. Thoughts?
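The reasoning behind that fix can be shown with a toy model. This is only a sketch: none of it is PostgreSQL source, all the toy_* names are hypothetical, and it assumes nothing more than what is described above, namely that a checkpoint fsyncs only the files that were registered beforehand.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_FILES 8

/* One relation file in the toy model. */
typedef struct ToyFile
{
    bool created;       /* exists, but possibly only in volatile cache */
    bool sync_pending;  /* queued to be fsync'd by the next checkpoint */
    bool durable;       /* has been "fsync'd" */
} ToyFile;

static ToyFile toy_files[TOY_MAX_FILES];

/* Stand-in for register_dirty_segment(): queue the file for the next checkpoint. */
void
toy_register_dirty(int id)
{
    assert(id >= 0 && id < TOY_MAX_FILES);
    toy_files[id].sync_pending = true;
}

/* Stand-in for mdcreate(); 'register_on_create' switches the proposed fix on. */
void
toy_create(int id, bool register_on_create)
{
    assert(id >= 0 && id < TOY_MAX_FILES);
    toy_files[id].created = true;
    toy_files[id].sync_pending = false;
    toy_files[id].durable = false;
    if (register_on_create)
        toy_register_dirty(id);
}

/* Stand-in for a checkpoint: only files that were queued get "fsync'd". */
void
toy_checkpoint(void)
{
    for (int i = 0; i < TOY_MAX_FILES; i++)
    {
        if (toy_files[i].sync_pending)
        {
            toy_files[i].durable = true;
            toy_files[i].sync_pending = false;
        }
    }
}

/* In this model, a crash loses every file that was never made durable. */
bool
toy_survives_crash(int id)
{
    assert(id >= 0 && id < TOY_MAX_FILES);
    return toy_files[id].created && toy_files[id].durable;
}
```

In this model, toy_create(0, false) followed by toy_checkpoint() leaves file 0 non-durable, which corresponds to the failure in the steps above, while toy_create(1, true) followed by toy_checkpoint() makes file 1 durable.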
- Heikki
Attachment | Content-Type | Size |
---|---|---|
0001-Ensure-that-creation-of-an-empty-relfile-is-fsync-d-.patch | text/x-patch | 1.1 KB |