From: | Justin Clift <aa2(at)bigpond(dot)net(dot)au> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Ned Lilly <ned(at)greatbridge(dot)com> |
Subject: | Re: WAL's single point of failure: latest CHECKPOINT record |
Date: | 2001-03-01 23:56:58 |
Message-ID: | 3A9EE1CA.7A7282EF@bigpond.net.au |
Lists: | pgsql-hackers |
Hi all,
Out of curiosity, does anyone know of any projects that are presently
creating PostgreSQL database recovery tools?
For example, tools for database corruption recovery, point-in-time
restoration, and similar tasks?
It might be a good project for GreatBridge to look into if no-one else
is doing it already.
Regards and best wishes,
Justin Clift
Database Administrator
Tom Lane wrote:
>
> As the WAL stuff is currently constructed, the system will refuse to
> start up unless the checkPoint field of pg_control points at a valid
> checkpoint record in the WAL log.
>
> Now I know we write and fsync the checkpoint record before we rewrite
> pg_control, but this still leaves me feeling mighty uncomfortable.
> See past discussions about how fsync order doesn't necessarily mean
> anything if the disk drive chooses to reorder writes. Since loss of
> the checkpoint record means complete loss of the database, I think we
> need to work harder here.
>
> What I'm thinking is that pg_control should have pointers to the last
> two checkpoint records, not only the last one. If we fail to read the
> most recent checkpoint, try the one before it (which, obviously, means
> we must keep the log files long enough that we still have that one too).
> We can run forward from there and redo the intervening WAL records the
> same as we would do anyway.
>
> This would mean an initdb to change the format of pg_control. However
> I already have a couple other reasons in favor of an initdb: the
> record-length bug I mentioned yesterday, and the bogus CRC algorithm.
> I'm not finished reviewing the WAL code, either :-(
>
> regards, tom lane
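
The fallback described in the quoted proposal (keep pointers to the last two checkpoint records, and start redo from the older one if the newest cannot be read) can be sketched roughly as below. This is an illustrative Python model only, not PostgreSQL code: the names `WalRecord`, `ControlFile`, `choose_redo_start` and `records_to_redo` are hypothetical, the WAL is modelled as an in-memory list where a lost record is `None`, and `zlib.crc32` merely stands in for whatever record checksum the real WAL code uses.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WalRecord:
    kind: str       # "checkpoint" or "data" (hypothetical record kinds)
    payload: bytes
    crc: int        # checksum stored alongside the record


def _checksum(kind: str, payload: bytes) -> int:
    # Stand-in checksum; the real WAL CRC algorithm is not modelled here.
    return zlib.crc32(kind.encode() + payload)


def make_record(kind: str, payload: bytes) -> WalRecord:
    return WalRecord(kind, payload, _checksum(kind, payload))


@dataclass
class ControlFile:
    # Hypothetical model of pg_control holding two checkpoint pointers.
    checkpoint: Optional[int]
    prev_checkpoint: Optional[int]


def is_valid_checkpoint(wal: List[Optional[WalRecord]], pos: Optional[int]) -> bool:
    # A pointer is usable only if it lands on an intact checkpoint record.
    if pos is None or not (0 <= pos < len(wal)):
        return False
    rec = wal[pos]
    return (
        rec is not None
        and rec.kind == "checkpoint"
        and rec.crc == _checksum(rec.kind, rec.payload)
    )


def choose_redo_start(control: ControlFile, wal: List[Optional[WalRecord]]) -> int:
    # Try the most recent checkpoint first, then the one before it.
    for pos in (control.checkpoint, control.prev_checkpoint):
        if is_valid_checkpoint(wal, pos):
            return pos
    raise RuntimeError("no valid checkpoint record found")


def records_to_redo(control: ControlFile, wal: List[Optional[WalRecord]]) -> List[WalRecord]:
    # Run forward from the chosen checkpoint and collect the data records
    # that follow it; unreadable slots are skipped in this toy model.
    start = choose_redo_start(control, wal)
    return [r for r in wal[start + 1:] if r is not None and r.kind == "data"]
```

In this model, losing the newest checkpoint record costs nothing but a longer redo pass, provided the log is retained back to the previous checkpoint, which is the retention requirement the proposal calls out.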