From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: pg_standby -l might destory the archived file |
Date: | 2009-06-04 08:53:05 |
Message-ID: | 1244105585.23910.153.camel@ebony.2ndQuadrant |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tue, 2009-06-02 at 14:54 -0400, Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
> > Tom Lane wrote:
> >> That's a good point; don't we recover files under names like
> >> RECOVERYXLOG, not under names that could possibly conflict with regular
> >> WAL files?
>
> > Yes. But we rename RECOVERYXLOG to 000000010000000000000057 or similar
> > at the end of recovery, in exitArchiveRecovery().
>
> > Thinking about this some more, I think we should've changed
> > exitArchiveRecovery() rather than RemoveOldXlogFiles(): it would be more
> > robust if exitArchiveRecovery() always copied the last WAL file rather
> > than just renamed it. It doesn't seem safe to rely on the file the
> > symlink points to to be valid after recovery is finished, and we might
> > write to it before it's recycled, so the current fix isn't complete.
>
> Hmm. I think really the reason it's coded that way is that we assumed
> the recovery command would be physically copying the file from someplace
> else. pg_standby is violating the backend's expectations by using a
> symlink. And I really doubt that the technique is saving anything, since
> the data has to be read in from the archive location anyway.
>
> I'm leaning back to the position that pg_standby's -l option is simply a
> bad idea and should be removed.
ISTM we didn't clearly state what the recovery_command should do either
way. Even if you remove the pg_standby option that will not fix the
problem for people who have written their own script, or existing users
of pg_standby. The safe way is to do as Heikki suggests and copy the
final file into place and I would add that we must then fsync it also.
That should be back-patched possibly as far as 8.0. Documenting a change
is not nearly enough.
Removing -l is a separate discussion. If there was a consensus against
it, I would suggest that we deprecate the option, so that it does
nothing.
As an aside, I would be also much more comfortable if there was an
option to not recycle the WAL files at all, as a safe mode for error
checking at least. The question is: why do we need to zero fill the file
first anyway? We could save the 8 bytes for the prev pointer on every
record if we added enough zeroes onto every WAL write to zap any
pre-existing header data, causing recovery to fail.
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2009-06-04 09:01:28 | Re: pg_standby -l might destory the archived file |
Previous Message | Simon Riggs | 2009-06-04 08:38:12 | Re: [COMMITTERS] pgsql: Fix LOCK TABLE to eliminate the race condition that could make it |