From: | Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us> |
---|---|
To: | jrnield(at)usol(dot)com |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Re: Issues Outstanding for Point In Time Recovery (PITR) |
Date: | 2002-07-06 02:08:54 |
Message-ID: | 200207060208.g6628sZ11083@candle.pha.pa.us |
Lists: | pgsql-hackers |
Bruce Momjian wrote:
> You are saying, "How do we know what WAL records go with that backup
> snapshot of the file?" OK, let's assume we are shut down. You can grab
> the WAL log info from pg_control using contrib/pg_controldata and that
> tells you what WAL logs to roll forward when you need to PIT recover
> that backup later. If you store that info in the first file you back up,
> you can have that WAL pointer available for later recovery in case you
> are restoring from that backup. Is that the issue?
>
> What seems more complicated is doing the backup while the database is
> active, and this may be a requirement for a final PITR solution. Some
> think we can grab the WAL pointer at 'tar' start and replay that on the
> backup even if the file changes during backup.
OK, I think I understand live backups now using tar and PITR. Someone
explained this to me months ago but now I understand it.
First, a key issue is that PostgreSQL doesn't fiddle with individual
items on disk. It reads an 8k block, modifies it, writes it to WAL (if
it hasn't been written to that WAL segment before), and writes it to
disk. That is key. (Are there cases where we don't do this, like
pg_control?)
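To make the block-at-a-time write path concrete, here is a minimal toy
sketch in Python. It is not PostgreSQL code: the class ToyRelation, its
fields, and the "imaged" set are hypothetical names, and it assumes the
copy sent to WAL is the page as it looked before the change (the
"pre-change page image" wording used later in this message).

```python
PAGE_SIZE = 8192  # the 8k block size mentioned above


class ToyRelation:
    """Toy model (hypothetical): a file of 8k pages plus an append-only WAL."""

    def __init__(self, n_pages):
        self.pages = [bytes(PAGE_SIZE) for _ in range(n_pages)]
        self.wal = []        # entries: (page_no, page_image)
        self.imaged = set()  # pages already imaged in the current WAL segment

    def update(self, page_no, offset, data):
        page = bytearray(self.pages[page_no])        # read the whole block
        if page_no not in self.imaged:               # first touch in this segment
            self.wal.append((page_no, bytes(page)))  # log the pre-change image
            self.imaged.add(page_no)
        page[offset:offset + len(data)] = data       # modify it in memory
        self.pages[page_no] = bytes(page)            # write the whole block back


rel = ToyRelation(n_pages=4)
rel.update(2, 100, b"hello")
rel.update(2, 200, b"world")  # same page again: no second page image
```

The point of the sketch is only that the unit of I/O is the whole page,
and that a page image is logged once, not on every change.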
OK, so you do a tar backup of a file. While you are doing the tar,
certain 8k blocks are being modified in the file. There is no way to
know what blocks are modified as you are doing the tar, and in fact you
could read partial page writes during the tar.
One solution would be to read the file using the PostgreSQL page buffer,
but even then, getting a stable snapshot of the file would be difficult.
Now, we could lock the table and prevent writes while it is being backed
up, but there is a better way.
We already have pre-change page images in WAL. When we do the backup,
any page that was modified while we were backing up is in the WAL. On
restore, we can recover whatever tar saw of the file, knowing that the
WAL page images will recover any page changes made during the tar.
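The argument above can be checked with a small simulation. This is a
sketch under stated assumptions, not how PostgreSQL stores WAL: the
record layout (page_no, pre_image, offset, data), the helper names, and
the idea that every page's first change after backup start carries its
pre-change image are all made up for illustration. The backup copies
pages while updates are running, including one torn page, and replaying
the WAL over it still reproduces the live file.

```python
PAGE_SIZE = 8192


def apply_change(page, offset, data):
    """Return a copy of page with data written at offset."""
    buf = bytearray(page)
    buf[offset:offset + len(data)] = data
    return bytes(buf)


def replay(backup_pages, wal):
    """Roll the WAL forward over whatever the backup happened to capture."""
    pages = list(backup_pages)
    for page_no, pre_image, offset, data in wal:
        if pre_image is not None:
            pages[page_no] = pre_image  # discard whatever tar saw of this page
        pages[page_no] = apply_change(pages[page_no], offset, data)
    return pages


live = [bytes(PAGE_SIZE) for _ in range(3)]  # the live database file
wal = []                                     # WAL written since backup start
imaged = set()                               # pages imaged since backup start


def update(page_no, offset, data):
    # Assumption: the first change to a page after backup start logs its image.
    pre = live[page_no] if page_no not in imaged else None
    imaged.add(page_no)
    wal.append((page_no, pre, offset, data))
    live[page_no] = apply_change(live[page_no], offset, data)


# The "tar" copies page by page while updates keep arriving.
backup = [live[0]]                            # page 0 copied, then changed
update(0, 0, b"A" * 10)
old1 = live[1]
update(1, 0, b"B" * PAGE_SIZE)                # page 1 rewritten mid-copy
torn = old1[:4096] + live[1][4096:]           # tar reads half old, half new
backup.append(torn)
update(2, 5, b"C")                            # page 2 changed before its copy
backup.append(live[2])
update(2, 6, b"D")                            # and again after its copy

restored = replay(backup, wal)
```

The backup itself matches no single moment of the live file, yet
restored equals live, because each touched page is reset from its image
before the logged changes are reapplied.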
Now, you mentioned we may not want pre-change page images in WAL
because, with PITR, we can more easily recover from the WAL rather than
having this performance hit for many page writes.
What I suggest is a way for the backup tar to turn on pre-change page
images while the tar is happening, and turn it off after the tar is
done.
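As a rough sketch of what that switch might look like from the backup
tool's side, here is a toy Python context manager. WalSettings,
page_images, and backup_mode are hypothetical names, not existing
PostgreSQL settings or functions; the sketch only shows the on/off
bracketing around the tar, with fsync left alone.

```python
from contextlib import contextmanager


class WalSettings:
    """Hypothetical settings holder, for illustration only."""

    def __init__(self):
        self.fsync = True
        self.page_images = False  # assumed off in normal operation


@contextmanager
def backup_mode(settings):
    """Turn page images on for the duration of a backup, then restore."""
    previous = settings.page_images
    settings.page_images = True
    try:
        yield settings
    finally:
        settings.page_images = previous  # restored even if the backup fails


settings = WalSettings()
seen = []
with backup_mode(settings):
    seen.append(settings.page_images)  # the tar would run here
seen.append(settings.page_images)
```

Restoring the previous value in a finally clause means a failed tar
does not leave page images switched on by accident.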
We already have this TODO item:
* Turn off after-change writes if fsync is disabled (?)
No sense in doing after-change WAL writes without fsync. We could
extend this so those after-change writes could be turned on and off,
allowing full tar backups and PITR recovery. In fact, for people with
reliable hardware, we should already be giving them the option of
turning off pre-change writes. We don't have a way of detecting partial
page writes, but then again, we can't detect failures with fsync off
anyway so it seems to be the same vulnerability. I guess that's why we
were going to wrap the effect into the same variable, but for PITR, I
can see wanting fsync always on and the ability to turn pre-change
writes on and off.
--
Bruce Momjian | http://candle.pha.pa.us
pgman(at)candle(dot)pha(dot)pa(dot)us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026