'Invalid lp' during heap_xlog_delete

From:	Daniel Wood <hexexpert(at)comcast(dot)net>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	'Invalid lp' during heap_xlog_delete
Date:	2019-11-08 20:46:51
Message-ID:	822113470.250068.1573246011818@connect.xfinity.com
Lists:	pgsql-hackers

Page on disk has empty lp 1
* Insert into page lp 1

checkpoint START. Redo eventually starts here.
** Delete all rows on page.
autovac truncate
DropRelFileNodeBuffers - dirty page NOT written. lp 1 on disk still empty
checkpoint completes
crash
smgrtruncate - Not reached

During recovery, heap_xlog_delete reads the page with lp 1 still empty, and the delete redo fails with 'invalid lp'.
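The timeline above can be sketched as a toy model. This is not PostgreSQL code: the function `replay_after_crash`, the LSN values, and the page representation are all hypothetical, and the model assumes the usual rule that redo skips a record when the page LSN is already at or past the record's LSN.

```python
class InvalidLp(Exception):
    """Stands in for the 'invalid lp' failure during delete redo."""


def replay_after_crash(flushed_version):
    """Replay WAL after the crash in the timeline above (toy model).

    flushed_version is the newest page version that reached disk before the
    buffers were dropped: None (neither), "insert" (*), or "delete" (**).
    """
    # Hypothetical LSNs: the insert precedes the checkpoint's redo pointer,
    # the delete follows it.
    wal = [(10, "insert"), (20, "delete")]
    redo_start = 15

    # On-disk page as recovery finds it (the truncate never happened).
    if flushed_version is None:
        page = {"lsn": 0, "lp1": "unused"}
    elif flushed_version == "insert":
        page = {"lsn": 10, "lp1": "normal"}
    elif flushed_version == "delete":
        # Tuple is marked deleted, but the line pointer is still in use.
        page = {"lsn": 20, "lp1": "normal"}
    else:
        raise ValueError(flushed_version)

    outcome = []
    for lsn, op in wal:
        if lsn < redo_start:
            continue  # before the redo pointer: never replayed
        if page["lsn"] >= lsn:
            outcome.append((lsn, op, "skipped"))  # page already has the change
            continue
        if op == "delete":
            if page["lp1"] == "unused":
                raise InvalidLp("invalid lp 1")
            page["lsn"] = lsn
            outcome.append((lsn, op, "applied"))
    return outcome
```

In this model only `replay_after_crash(None)` raises; if either the `*` or the `**` version reached disk, the delete is applied or skipped cleanly.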

For this to reproduce, the checkpoint must not yet have written either dirty version of the page (* or **) by the time DropRelFileNodeBuffers invalidates it.

Even if we reach the truncate, we don't fsync it until the next checkpoint, so on filesystems that delay metadata updates a crash can lose the truncate.

Once we do the fsync() for the truncate, the REDO read will return BLK_NOTFOUND and the DELETE REDO attempt will be skipped.
Without the fsync(), or if we crash before the truncate, the delete redo depends on the most recent version of the page having been written by the checkpoint.
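A minimal sketch of that decision, again as a toy model rather than PostgreSQL source. `redo_read` and its parameters are hypothetical; BLK_NOTFOUND is the result named above, and BLK_NEEDS_REDO and BLK_DONE are assumed to be the other relevant outcomes of the redo read.

```python
from enum import Enum


class RedoAction(Enum):
    BLK_NEEDS_REDO = "needs redo"
    BLK_DONE = "done"
    BLK_NOTFOUND = "not found"


def redo_read(file_nblocks, blkno, page_lsn, record_lsn):
    """Decide what redo does with one block reference (toy model).

    file_nblocks is the relation length recovery sees on disk, so it reflects
    whether the truncate was both reached and made durable.
    """
    if blkno >= file_nblocks:
        # Truncate is durable: the block is gone and the record is skipped.
        return RedoAction.BLK_NOTFOUND
    if page_lsn >= record_lsn:
        # The page on disk already contains this change.
        return RedoAction.BLK_DONE
    # Block exists and is older than the record: redo runs against whatever
    # page version is on disk, stale or not.
    return RedoAction.BLK_NEEDS_REDO
```

With a durable truncate, `redo_read(0, 0, 0, 20)` yields BLK_NOTFOUND. With the truncate lost or never reached and a stale page, `redo_read(1, 0, 0, 20)` yields BLK_NEEDS_REDO, which is the case where the empty lp 1 is hit.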

Found during a stress test and verified by adding pg_usleep() calls to test the hypothesis.

Is DropRelFileNodeBuffers purely for performance, or would there be correctness problems if it were not done?

Responses

Re: 'Invalid lp' during heap_xlog_delete at 2019-11-09 01:39:37 from Michael Paquier
Re: 'Invalid lp' during heap_xlog_delete at 2019-12-06 23:06:40 from Andres Freund
