xlog loose ends, continued

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Mikheev, Vadim" <vmikheev(at)SECTORBASE(dot)COM>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: xlog loose ends, continued
Date: 2001-03-13 03:50:20
Message-ID: 17821.984455420@sss.pgh.pa.us
Lists: pgsql-hackers

There is another loose end that I forgot I needed to discuss with you.

xlog.c's ReadRecord formerly contained code that would zero out the rest
of the log segment (and delete the next log segment, if any) upon
detecting a missing or corrupted xlog record. I removed that code
because I considered it horribly dangerous where it was. If there is
anything wrong with either the xlog or pg_control's pointers to it,
that code was quite capable of wiping out all hope of recovery *and*
all evidence of what went wrong.

I think it's really bad to automatically destroy log data, especially
when we do not yet know if we are capable of recovering. If we need
this functionality, it should be invoked only at the completion of
StartupXLOG, after we have finished the recovery phase. However,
I'd be a lot happier if we could avoid wholesale zeroing at all.

I presume the point of this code was that if we recover and then suffer
a later crash at a point where we've just written an xlog record that
exactly fills an xlog page, a subsequent scan of the log might continue
on from that point and pick up xlog records from the prior (failed)
system run. Is there a way to guard against that scenario without
having to zero out data during recovery?

One thought that comes to mind is to store StartUpID in XLOG page
headers, and abort log scanning if we come to a page with StartUpID
less than what came before. Is that secure/sufficient? Is there
a better way?
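
To make the StartUpID idea concrete, here is a rough sketch of the scan-time check, not actual xlog.c code. The struct SketchPageHeader, its startup_id field, and the function sketch_valid_page_count are hypothetical names invented for illustration; the real page header layout would differ. It assumes each page header records the StartUpID of the run that wrote it, and treats the first page whose ID is lower than its predecessor's as the end of the usable log.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page header: only the field relevant to this sketch. */
typedef struct SketchPageHeader
{
    uint32_t startup_id;    /* StartUpID of the system run that wrote the page */
} SketchPageHeader;

/*
 * Return how many leading pages form a consistent log.  Scanning stops at
 * the first page whose startup_id is lower than that of the page before it,
 * on the assumption that such a page is left over from an earlier run.
 */
size_t
sketch_valid_page_count(const SketchPageHeader *pages, size_t npages)
{
    size_t i;

    for (i = 1; i < npages; i++)
    {
        if (pages[i].startup_id < pages[i - 1].startup_id)
            return i;       /* stale page: abort the scan here */
    }
    return npages;          /* no decrease seen; every page is usable */
}
```

With pages carrying IDs 3, 3, 4, 2, 2 this would stop after the third page, so the two leftover pages from run 2 are never read and nothing has to be zeroed. Whether the comparison alone is sufficient is exactly the open question above.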

regards, tom lane
