| From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> | 
|---|---|
| To: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> | 
| Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> | 
| Subject: | Re: FSM corruption leading to errors | 
| Date: | 2016-10-17 11:04:39 | 
| Message-ID: | 486a13af-67a2-fb69-2240-26e150da9293@iki.fi | 
| Lists: | pgsql-hackers | 

On 10/10/2016 05:25 PM, Michael Paquier wrote:
> On Fri, Oct 7, 2016 at 2:59 AM, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> wrote:
>> I believe the fix is very simple. The FSM change during truncation is
>> critical and the buffer must be marked by MarkBufferDirty(), i.e. those
>> changes must make it to disk. I think it's alright not to WAL-log them
>> because XLOG_SMGR_TRUNCATE will redo() them if a crash occurs. But it must
>> not be lost across a checkpoint. Also, since it happens only during relation
>> truncation, I don't see any problem from performance perspective.
>
> Agreed. I happened to notice that the VM is similarly careful when it
> comes to truncation (visibilitymap_truncate).

visibilitymap_truncate is actually also wrong, in a different way. The 
truncation WAL record is written only after the VM (and FSM) are 
truncated. But visibilitymap_truncate() has already modified and dirtied 
the page. If the VM page change is flushed to disk before the WAL 
record, and you crash, you might have a torn VM page and a checksum failure.

Simply replacing the MarkBufferDirtyHint() call with MarkBufferDirty() 
in FreeSpaceMapTruncateRel would have the same issue. If you call 
MarkBufferDirty(), you must WAL-log the change, and also set the page's 
LSN to make sure the WAL record is flushed first.
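
To illustrate the ordering rule, here is a minimal toy model in C. It is not PostgreSQL code: the `Toy*` types and `toy_*` functions are hypothetical names invented for this sketch, and the "WAL" is just a pair of counters. It only shows why stamping the page with the record's LSN forces the WAL to be flushed before the page, and what can happen when a page is dirtied without that stamp.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy types; none of these exist in PostgreSQL. */
typedef uint64_t ToyLsn;

typedef struct {
    ToyLsn insert_pos;  /* LSN of the last record inserted into the WAL */
    ToyLsn flush_pos;   /* LSN up to which the WAL is durable on disk   */
} ToyWal;

typedef struct {
    ToyLsn page_lsn;    /* LSN of the last WAL record covering the page */
    bool   dirty;       /* page differs from its on-disk copy           */
    int    nslots;      /* stand-in for the page contents               */
} ToyBuffer;

/* Insert one WAL record and return its LSN. */
static ToyLsn toy_wal_insert(ToyWal *wal)
{
    return ++wal->insert_pos;
}

/* Make the WAL durable up to 'upto' (never beyond what was inserted). */
static void toy_wal_flush(ToyWal *wal, ToyLsn upto)
{
    if (upto > wal->insert_pos)
        upto = wal->insert_pos;
    if (wal->flush_pos < upto)
        wal->flush_pos = upto;
}

/* Write the page out, honoring WAL-before-data: the WAL is flushed up to
 * the page's LSN first. Only the page LSN tells the writer how far. */
static void toy_flush_buffer(ToyWal *wal, ToyBuffer *buf)
{
    toy_wal_flush(wal, buf->page_lsn);
    buf->dirty = false;
}

/* Safe pattern: modify, mark dirty, log the change, stamp the page LSN. */
static void toy_truncate_logged(ToyWal *wal, ToyBuffer *buf, int new_nslots)
{
    buf->nslots = new_nslots;
    buf->dirty = true;
    buf->page_lsn = toy_wal_insert(wal);
}

/* Unsafe pattern: modify and mark dirty, but leave the page LSN alone.
 * A record written later does not protect this page. */
static void toy_truncate_unlogged(ToyBuffer *buf, int new_nslots)
{
    buf->nslots = new_nslots;
    buf->dirty = true;
}
```

In the logged variant, writing the page out first makes the record durable. In the unlogged variant, the page can reach disk while the later truncation record is still only in memory, which is the window described above.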

I think we need something like the attached.

- Heikki

| Attachment | Content-Type | Size | 
|---|---|---|
| 0001-WIP-Fix-FSM-corruption-leading-to-errors.patch | text/x-patch | 12.5 KB | 