From: | Ashwin Agrawal <aagrawal(at)pivotal(dot)io> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: finding changed blocks using WAL scanning |
Date: | 2019-04-11 17:00:35 |
Message-ID: | CALfoeis0qOyGk+KQ3AbkfRVv=XbsSecqHfKSag=i_SLWMT+B0A@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Apr 11, 2019 at 6:27 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Apr 11, 2019 at 3:52 AM Peter Eisentraut
> <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> > I had in mind that you could have different overlapping incremental
> > backup jobs in existence at the same time. Maybe a daily one to a
> > nearby disk and a weekly one to a faraway cloud. Each one of these
> > would need a separate replication slot, so that the information that is
> > required for *that* incremental backup series is preserved between runs.
> > So just one reserved replication slot that feeds the block summaries
> > wouldn't work. Perhaps what would work is a flag on the replication
> > slot itself "keep block summaries for this slot". Then when all the
> > slots with the block summary flag are past an LSN, you can clean up the
> > summaries before that LSN.
>
> I don't think that quite works. There are two different LSNs. One is
> the LSN of the oldest WAL archive that we need to keep around so that
> it can be summarized, and the other is the LSN of the oldest summary
> we need to keep around so it can be used for incremental backup
> purposes. You can't keep both of those LSNs in the same slot.
> Furthermore, the LSN stored in the slot is defined as the amount of
> WAL we need to keep, not the amount of something else (summaries) that
> we need to keep. Reusing that same field to mean something different
> sounds inadvisable.
>
> In other words, I think there are two problems which we need to
> clearly separate: one is retaining WAL so we can generate summaries,
> and the other is retaining summaries so we can generate incremental
> backups. Even if we solve the second problem by using some kind of
> replication slot, we still need to solve the first problem somehow.
>
Just a thought on the first problem (it may not be any simpler): could the
replication slot be enhanced to define X amount of WAL to retain, so that
once that limit is reached, the summary is collected and the WAL can be deleted?
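
A rough sketch of that idea, as a toy model in Python rather than PostgreSQL code. Everything here is hypothetical: `CappedSlot`, `max_retained_bytes` and `on_wal_flush` are illustrative names, LSNs are modeled as plain integer byte positions, and a "summary" is reduced to the LSN range it covers.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class CappedSlot:
    """Hypothetical slot that retains at most max_retained_bytes of WAL."""

    max_retained_bytes: int
    restart_lsn: int = 0  # oldest WAL position still retained
    # Each summary is recorded only as the (start_lsn, end_lsn) range it covers.
    summaries: List[Tuple[int, int]] = field(default_factory=list)

    def on_wal_flush(self, flush_lsn: int) -> None:
        """Called after WAL has been flushed up to flush_lsn."""
        retained = flush_lsn - self.restart_lsn
        if retained <= self.max_retained_bytes:
            return  # still under the cap, keep all WAL

        # Over the cap: summarize the oldest WAL first, and only then
        # advance restart_lsn so that the summarized WAL can be removed.
        new_restart = flush_lsn - self.max_retained_bytes
        self.summaries.append((self.restart_lsn, new_restart))
        self.restart_lsn = new_restart


slot = CappedSlot(max_retained_bytes=100)
slot.on_wal_flush(80)    # under the cap, nothing happens
slot.on_wal_flush(250)   # over the cap, summarize [0, 150) and release it
```

The only point of the sketch is the ordering: a range is summarized before `restart_lsn` moves past it, so no WAL is dropped unsummarized. It says nothing about how long the summaries themselves have to be kept, which is the second problem.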