Re: data corruption hazard in reorderbuffer.c

From:	Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To:	Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: data corruption hazard in reorderbuffer.c
Date:	2021-07-15 22:32:07
Message-ID:	d8302df8-e37b-a518-1e95-925eeb086af5@enterprisedb.com
Lists:	pgsql-hackers

Hi,

I think it's mostly futile to list all the possible issues this might
have caused - if you skip arbitrary decoded changes, that can trigger
pretty much any bug in reorder buffer. But those bugs can be triggered
by various other issues, of course.

It's hard to say what the cause was, but the "logic" bugs are probably
permanent, while the issues triggered by I/O probably disappear after a
restart?

That being said, I agree this seems like an issue and we should not
ignore I/O errors. I'd bet other places using transient files (like
sorting or hashagg spilling to disk) have the same issue, although in
that case the impact is likely limited to a single query.
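
To illustrate what "not ignoring I/O errors" means for a transient file,
here is a minimal sketch in plain POSIX C. It is not PostgreSQL code:
spill_write_checked() is a hypothetical helper, and the real
reorderbuffer.c goes through PostgreSQL's own file wrappers. The point is
only that short writes and a failing close() both get reported to the
caller.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (an assumption for illustration, not PostgreSQL
 * code): write len bytes to a new file at path and report every I/O
 * failure. Returns 0 on success, -1 on failure with errno set.
 */
static int
spill_write_checked(const char *path, const void *buf, size_t len)
{
    const char *p = buf;
    int         fd;

    assert(path != NULL);

    fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (len > 0)
    {
        ssize_t     n = write(fd, p, len);

        if (n < 0 && errno == EINTR)
            continue;           /* interrupted before writing, retry */

        if (n <= 0)
        {
            /* treat "wrote nothing" as out of space if errno is unset */
            int         saved = (n == 0) ? ENOSPC : errno;

            close(fd);
            errno = saved;
            return -1;
        }

        p += n;                 /* short write: continue with the rest */
        len -= (size_t) n;
    }

    /* close() may report a deferred write error, so check it too */
    if (close(fd) != 0)
        return -1;

    return 0;
}
```

A caller would treat a -1 return as a hard error for the operation that
needed the file, rather than carrying on with a possibly truncated spill.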

I wonder if sync before the close is an appropriate solution, though. It
seems rather expensive, and those files are meant to be "temporary"
(i.e. we don't keep them over restart). So maybe we could ensure
consistency in a cheaper way - perhaps tracking some sort of checksum
for each file, or something like that?
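
A rough sketch of the per-file checksum idea, in standalone C. All names
here (spill_write, spill_read_verify, fnv1a_update) are hypothetical, and
FNV-1a is used only because it fits in a few lines; it is a stand-in, not
a suggestion for the actual algorithm. The writer appends a checksum
trailer after the payload, and the reader refuses the file if the trailer
is missing or does not match, so a lost or damaged write is detected at
read time without paying for a sync.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define FNV1A_INIT 2166136261u

/* Fold len bytes into a running 32-bit FNV-1a hash (stand-in checksum). */
static uint32_t
fnv1a_update(uint32_t h, const void *buf, size_t len)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Write the payload followed by its checksum. 0 on success, -1 on error. */
static int
spill_write(const char *path, const void *buf, size_t len)
{
    uint32_t    sum = fnv1a_update(FNV1A_INIT, buf, len);
    FILE       *f;
    int         ok;

    assert(path != NULL);

    f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    ok = fwrite(buf, 1, len, f) == len &&
        fwrite(&sum, sizeof sum, 1, f) == 1;

    /* fclose() flushes buffered data, so its result matters too */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}

/*
 * Read len payload bytes into buf and compare against the stored
 * checksum. Returns 0 if the file is intact, -1 on I/O error,
 * truncation, or checksum mismatch.
 */
static int
spill_read_verify(const char *path, void *buf, size_t len)
{
    uint32_t    stored;
    FILE       *f;
    int         ok;

    assert(path != NULL);

    f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    ok = fread(buf, 1, len, f) == len &&
        fread(&stored, sizeof stored, 1, f) == 1;
    fclose(f);

    if (!ok)
        return -1;

    return fnv1a_update(FNV1A_INIT, buf, len) == stored ? 0 : -1;
}
```

This only detects the problem; what to do on a mismatch (error out,
re-decode from WAL, ...) is a separate question.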

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: data corruption hazard in reorderbuffer.c at 2021-07-15 21:17:32 from Mark Dilger

Responses

Re: data corruption hazard in reorderbuffer.c at 2021-07-15 23:07:46 from Mark Dilger