Re: Cost of XLogInsert CRC calculations

From: Greg Stark <gsstark(at)mit(dot)edu>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, "Mark Cave-Ayland" <m(dot)cave-ayland(at)webbased(dot)co(dot)uk>, "'Manfred Koizar'" <mkoi-pg(at)aon(dot)at>, "'Bruce Momjian'" <pgman(at)candle(dot)pha(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Cost of XLogInsert CRC calculations
Date: 2005-05-31 16:02:12
Message-ID: 873bs3wfi3.fsf@stark.xeocode.com
Lists: pgsql-hackers

Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> writes:

> > Is the random WAL data really the concern? It seems like a more reliable way
> > of dealing with that would be to just accompany every WAL entry with a
> > sequential id and stop when the next id isn't the correct one.
>
> We do that, too (the xl_prev links and page header addresses serve that
> purpose). But it's not sufficient given that WAL records can span pages
> and therefore may be incompletely written.

Right, so the problem isn't that there may be stale data that would be
indistinguishable from real data. The problem is that the real data may be
partially there but not all there.
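
To illustrate the partial-write case, here is a minimal sketch of how a
per-record checksum catches a record whose tail never reached disk. The record
layout (a length and a CRC-32 in front of the payload), the helper names, and
the use of zlib's CRC-32 are all assumptions made for this example; this is
not PostgreSQL's actual WAL format or CRC.

```python
import struct
import zlib

# Hypothetical record header: payload length, then CRC-32 of the payload.
HEADER = struct.Struct("<II")


def make_record(payload: bytes) -> bytes:
    """Build a record whose header carries a checksum of its payload."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def record_is_complete(buf: bytes) -> bool:
    """Return True only if the whole payload is present and matches the CRC."""
    if len(buf) < HEADER.size:
        return False
    length, crc = HEADER.unpack_from(buf)
    payload = buf[HEADER.size:HEADER.size + length]
    return len(payload) == length and zlib.crc32(payload) == crc


def torn_write(old: bytes, new: bytes, bytes_written: int) -> bytes:
    """Simulate a crash after only the first bytes_written bytes of new landed.

    The rest of the region still holds whatever was on disk before (old).
    """
    return new[:bytes_written] + old[bytes_written:]
```

With a record built by make_record, writing everything except its last few
bytes over a zero-filled region leaves a valid-looking header followed by a
payload that fails the check, so replay can stop there.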

> > The only truly reliable way to handle this would require two fsyncs per
> > transaction commit which would be really unfortunate.
>
> How are two fsyncs going to be better than one?

Well, you fsync the WAL entry, and only when that's complete do you flip a bit
marking the WAL entry as committed and fsync again.

Hm, you might need three fsyncs, one to make sure the bit isn't set before
writing out the WAL record itself.
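
The ordering described above can be sketched roughly as follows. This is only
an illustration of the flag-plus-fsync idea, not how PostgreSQL writes WAL: the
one-byte flag, the length-prefixed layout, and the function names write_record
and read_record are hypothetical, and partial writes and other I/O errors are
ignored.

```python
import os
import struct

UNCOMMITTED = b"\x00"
COMMITTED = b"\x01"
LEN = struct.Struct("<I")  # hypothetical 4-byte payload length


def write_record(fd: int, offset: int, payload: bytes) -> int:
    """Write one record at offset using three fsyncs; return the end offset."""
    # fsync 1: make sure a cleared flag is durable before any record bytes.
    os.lseek(fd, offset, os.SEEK_SET)
    os.write(fd, UNCOMMITTED)
    os.fsync(fd)

    # fsync 2: write the record body and wait until it is on disk.
    os.write(fd, LEN.pack(len(payload)) + payload)
    os.fsync(fd)

    # fsync 3: only now flip the flag to mark the record as committed.
    os.lseek(fd, offset, os.SEEK_SET)
    os.write(fd, COMMITTED)
    os.fsync(fd)
    return offset + len(COMMITTED) + LEN.size + len(payload)


def read_record(fd: int, offset: int):
    """Return the payload at offset, or None if the flag is not set."""
    os.lseek(fd, offset, os.SEEK_SET)
    if os.read(fd, 1) != COMMITTED:
        return None
    (length,) = LEN.unpack(os.read(fd, LEN.size))
    return os.read(fd, length)
```

A reader that finds the flag unset treats the record as never written, which
is why the flag has to be durably clear before the body goes out.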

--
greg
