Re: Theory about XLogFlush startup failures

From:	Hiroshi Inoue <Inoue(at)tpf(dot)co(dot)jp>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Vadim Mikheev <vmikheev(at)sectorbase(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject:	Re: Theory about XLogFlush startup failures
Date:	2002-01-15 02:23:44
Message-ID:	3C4392B0.637CF161@tpf.co.jp
Lists:	pgsql-hackers

Tom Lane wrote:
>
> I just spent some time trying to understand the mechanism behind the
> "XLogFlush: request is not satisfied" startup errors we've seen reported
> occasionally with 7.1. The only apparent way for this to happen is for
> XLogFlush to be given a garbage WAL record pointer (i.e., one pointing
> beyond the current end of WAL), which presumably must be coming from
> a corrupted LSN field in a data page. Well, that's not too hard to
> believe during normal operation: say the disk drive drops some bits in
> the LSN field, and we read the page in, and don't have any immediate
> need to change it (which would cause the LSN to be overwritten); but we
> do find some transaction status hint bits to set, so the page gets
> marked dirty. Then when the page is written out, bufmgr will try to
> flush xlog using the corrupted LSN pointer.
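
The failure path described in the quoted text can be sketched as a small C model. This is only an illustration under stated assumptions: the types `Lsn`, `Page`, and `Wal` and the functions `wal_flush`, `set_hint_bits`, and `write_page` are hypothetical stand-ins, not PostgreSQL's actual definitions. It shows how a page whose stored LSN points beyond the end of WAL makes the flush request unsatisfiable once the page is dirtied by a hint-bit update.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins; not PostgreSQL's real structures. */
typedef uint64_t Lsn;

typedef struct {
    Lsn  lsn;    /* LSN stored in the page header, possibly corrupted on disk */
    bool dirty;  /* page must be written out before it can be evicted */
} Page;

typedef struct {
    Lsn end_of_wal;  /* last WAL position that actually exists */
    Lsn flushed_to;  /* WAL is known to be on disk up to here */
} Wal;

/* Flush WAL through 'request'. Returning false models the
 * "request is not satisfied" condition: the caller asked for
 * a position beyond the current end of WAL. */
static bool wal_flush(Wal *wal, Lsn request)
{
    if (request <= wal->flushed_to)
        return true;               /* already flushed far enough */
    if (request > wal->end_of_wal)
        return false;              /* garbage pointer: cannot be satisfied */
    wal->flushed_to = request;
    return true;
}

/* Setting hint bits dirties the page but leaves its LSN untouched,
 * so a corrupted LSN survives until write-out. */
static void set_hint_bits(Page *page)
{
    page->dirty = true;
}

/* Write-ahead rule: WAL must be flushed through the page's LSN
 * before the page itself is written. */
static bool write_page(Wal *wal, Page *page)
{
    if (!page->dirty)
        return true;
    if (!wal_flush(wal, page->lsn))
        return false;              /* page stays dirty */
    page->dirty = false;
    return true;
}
```

In this model a page with an intact LSN is written normally, while a page with a few flipped high bits in its LSN fails in `write_page` every time it is attempted.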

I agree with you, at least on the point that we had better
continue FlushBufferPool() even if a STOP-level error occurs.
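
As a rough sketch of what "continue" could mean here, the loop below keeps writing the remaining buffers after one write fails and only reports the failure count at the end. The names `Buf`, `buf_write`, and `flush_pool_continue` are hypothetical and chosen for illustration; this is not the actual FlushBufferPool() code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical buffer descriptor; not PostgreSQL's real structure. */
typedef struct {
    uint64_t lsn;    /* LSN recorded in the buffered page */
    bool     dirty;  /* needs to be written out */
} Buf;

/* Write one buffer. The write fails when the page's LSN lies beyond
 * the end of WAL, which is how a corrupted LSN shows up in this model. */
static bool buf_write(Buf *b, uint64_t wal_end)
{
    if (!b->dirty)
        return true;
    if (b->lsn > wal_end)
        return false;
    b->dirty = false;
    return true;
}

/* Flush every buffer, continuing past individual failures instead of
 * stopping at the first one. Returns the number of buffers that could
 * not be written, so the caller can still decide how to react. */
static size_t flush_pool_continue(Buf *pool, size_t n, uint64_t wal_end)
{
    size_t failures = 0;

    for (size_t i = 0; i < n; i++) {
        if (!buf_write(&pool[i], wal_end))
            failures++;   /* remember the failure, keep going */
    }
    return failures;
}
```

With this shape, one page carrying a bad LSN no longer prevents the other dirty buffers from reaching disk.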

BTW, doesn't the LSN corruption imply that other parts
(e.g. pg_log) may be corrupted as well?

regards,
Hiroshi Inoue

In response to

Theory about XLogFlush startup failures at 2002-01-12 20:46:30 from Tom Lane

Responses

Re: Theory about XLogFlush startup failures at 2002-01-15 02:49:49 from Tom Lane
