Re: Command to prune archive at restartpoints

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Greg Stark <gsstark(at)mit(dot)edu>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Command to prune archive at restartpoints
Date: 2010-03-17 10:14:40
Message-ID: 4BA0AB90.5060208@enterprisedb.com
Lists: pgsql-hackers

Greg Stark wrote:
> On Wed, Mar 17, 2010 at 9:37 AM, Heikki Linnakangas
> <heikki(dot)linnakangas(at)enterprisedb(dot)com> wrote:
>> One awkward omission in the new built-in standby mode, mainly used for
>> streaming replication, is that there is no easy way to delete old
>> archived files like you do with the %r parameter to restore_command.
>
> I'm still finding this kind of narrow-minded. I'm picturing a system
> with multiple replicas -- obviously no one replica can take it upon
> itself to delete archived log files based only on its own
> restartpoint. And besides, if you're using the archived log files for
> backups you also need to take into account the backup policy and only
> delete files that aren't needed for a consistent backup and aren't
> needed for the replica.

That's why we provide options that take any shell command you want,
rather than e.g. a path to an archive directory that's pruned automatically.

For example, if you have multiple standbys sharing one archive, you
could do something like this:

In each standby, have a restartpoint_command along the lines of:
"echo %r > <archivedirectory>/standby1_location; archive_cleanup.sh"

where the '1' is a different number on every standby,

and in archive_cleanup.sh, scan through all the standbyX_location files,
take the minimum, and delete all archived files that sort before it.

You'll need some care with locking etc., but the point is that the
current hooks allow you to implement complex setups like that.
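
To make that a bit more concrete, here is a rough sketch of what such an
archive_cleanup.sh might look like. It is an illustration only, not a tested
tool: the function names, the cleanup.lock file and passing the archive
directory as the first argument are my own assumptions, and the locking relies
on flock(1) from util-linux being available. It also assumes a single
timeline, so that WAL file names can be compared as plain strings, and it
cannot protect a standby that has not yet written its location file.

```shell
#!/bin/sh
# Hypothetical archive_cleanup.sh: remove archived WAL files that no standby
# still needs. Each standby is assumed to have written its %r value into
# <archivedirectory>/standbyN_location.

# Print the smallest WAL file name recorded by any standby (empty if none).
min_restart_file() {
    cat "$1"/standby*_location 2>/dev/null |
        grep -E '^[0-9A-F]{24}$' | LC_ALL=C sort | head -n 1
}

# Delete every WAL segment in directory $1 whose name sorts before $2.
prune_before() {
    dir=$1
    cutoff=$2
    for path in "$dir"/*; do
        name=${path##*/}
        # Only touch files named like WAL segments: 24 upper-case hex digits.
        [ "${#name}" -eq 24 ] || continue
        case $name in *[!0-9A-F]*) continue ;; esac
        # Byte-wise string comparison: which of the two names sorts first?
        first=$(printf '%s\n%s\n' "$name" "$cutoff" | LC_ALL=C sort | head -n 1)
        if [ "$name" != "$cutoff" ] && [ "$first" = "$name" ]; then
            rm -f -- "$path"
        fi
    done
}

# Serialize concurrent runs from different standbys with an exclusive lock.
main() {
    (
        flock -x 9
        cutoff=$(min_restart_file "$1")
        if [ -n "$cutoff" ]; then
            prune_before "$1" "$cutoff"
        fi
    ) 9>"$1/cleanup.lock"
}

# Run only when an archive directory is given on the command line.
if [ "$#" -ge 1 ]; then
    main "$1"
fi
```

With this sketch, the restartpoint_command above would end in
"archive_cleanup.sh <archivedirectory>". The lock only keeps two cleanup runs
from interleaving; it does nothing about a standby that is halfway through
writing its location file, which is why lines that don't look like a complete
WAL file name are ignored.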

> What we need is a program which can take all this information from all
> your slaves and backup labels into account and implement your backup
> policies. It probably won't exist in time for the release and in any
> case doesn't really have to ship with Postgres. There might even be
> more than one.

I guess I just described such a program :-). Yeah, I'd imagine that
becoming part of toolkits like skytools.

> But do we have all the information that such a program would need? Is
> there a way to connect to a replica and ask it what the restart point
> is?

Hmm, Greg Smith opened a thread on exposing the fields in the control
file as user-defined functions. IIRC the last restartpoint location was
the piece of information that triggered the discussion this time. Perhaps
we should indeed add a function to expose that in 9.0.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
