Re: "Resurrected" data files - problem?

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: Tom Lane *EXTERN* <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Peter Childs <peterachilds(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: "Resurrected" data files - problem?
Date: 2007-11-08 21:45:46
Message-ID: 1194558346.4251.320.camel@ebony.site
Lists: pgsql-general

On Thu, 2007-11-08 at 17:11 +0100, Albe Laurenz wrote:
> Tom Lane wrote:
> >>> So if we perform our database backups with incremental
> >>> backups as described above, we could end up with additional
> >>> files after the restore, because PostgreSQL files can get
> >>> deleted (e.g. during DROP TABLE or TRUNCATE TABLE).
> >>>
> >>> Could such "resurrected" files (data files, files in
> >>> pg_xlog, pg_clog or elsewhere) cause a problem for the database
> >>> (other than the obvious one that there may be unnecessary files
> >>> about that consume disk space)?
> >>
> >> This will not work at all.
> >
> > To be more specific: the resurrected files aren't the problem;
> > offhand I see no reason they'd create any issue beyond wasted
> > disk space. The problem is version skew between files that were
> > backed up at slightly different times, leading to inconsistency.
>
> I should have mentioned that before the (incremental) backup
> there would be a pg_start_backup() and a pg_stop_backup()
> afterwards, and we would use PITR.
>
> So there could only be three kinds of files:
> - Files that did not change since the full backup, restored
> from there. They should therefore look exactly as if the
> online backup were performed in the normal way.
> - Files that have changed or are new, restored from the
> incremental backup. These will also be ok, because
> they were backed up between pg_start_backup() and
> pg_stop_backup().
> - Files that have been deleted between full and incremental
> backup and have been resurrected.
>
> This third group is the only one which might be problematic,
> as far as I can see, because PostgreSQL will not expect them to
> be there.
>
> The version skew between files backed up at slightly different
> times should be taken care of by PITR, shouldn't it?

The backup is not instantaneous, so there is no single point in time at
which the hot backup takes place. Deciding whether a file has changed by
comparing two file timestamps therefore cannot work. You would need to
record each file's timestamp both before pg_start_backup() and after
pg_stop_backup(), for both the full and the incremental backup. If all
four timestamps are equal, then you are safe.
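
As a rough illustration of the four-timestamp rule, here is a minimal
Python sketch. The BackupStamps record and the function name are
hypothetical, and it assumes the backup tool has already sampled each
file's modification time at the four points described above. It is not
part of PostgreSQL or of any existing backup tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupStamps:
    """Modification times of one file around one backup run (hypothetical record)."""
    before_start: float  # mtime sampled just before pg_start_backup()
    after_stop: float    # mtime sampled just after pg_stop_backup()


def unchanged_between_backups(full: BackupStamps, incr: BackupStamps) -> bool:
    """Return True only if all four recorded mtimes are identical.

    Anything else means the file may have changed during or between the
    two backups, so it should be taken from the incremental backup.
    """
    stamps = {
        full.before_start,
        full.after_stop,
        incr.before_start,
        incr.after_stop,
    }
    return len(stamps) == 1
```

A file for which this returns False would be copied again in the
incremental backup rather than reused from the full one.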

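The relfilenode ids are potentially reused after a period of time, so a
file in the incremental backup may belong to a different relation than
the file of the same name in the full backup. That could cause errors if
not catered for on the incremental restore.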

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
