Re: On markers of changed data

From: Greg Stark <stark(at)mit(dot)edu>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: On markers of changed data
Date: 2017-10-10 20:43:51
Message-ID: CAM-w4HME8Kx3saN27CZn-zPw2DWsmc8osXR=8PuJJ0Q2rKU6BQ@mail.gmail.com
Lists: pgsql-hackers

On 8 October 2017 at 08:52, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> 1. Any other marker would be better (It can be WAL scan during archiving, some new LSN-based mechanics* et c.)

The general shape of what I would like to see is some log which lists
where each checkpoint starts and ends and what blocks are modified
since the previous checkpoint. Then to generate an incremental backup
from any point in time to the current you union all the block lists
between them and fetch those blocks. There are other ways of using
this aside from incremental backups on disk too -- you could imagine a
replica that has fallen behind requesting the block lists and then
fetching just those blocks instead of needing to receive and apply all
the WAL. Or possibly even making a cost-based decision between the two
depending on which would be faster.
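
As a rough illustration of the union step, here is a minimal Python
sketch. Everything in it is an assumption made for illustration: the
CheckpointEra record, the (relation, block number) identifiers and the
integer LSNs are hypothetical, not an existing PostgreSQL format or API.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set, Tuple

# Hypothetical block identifier: (relation file name, block number).
BlockId = Tuple[str, int]


@dataclass
class CheckpointEra:
    """One log entry: where a checkpoint era starts and ends, and
    which blocks were modified during it (assumed layout)."""
    start_lsn: int
    end_lsn: int
    modified: Set[BlockId] = field(default_factory=set)


def blocks_changed_since(eras: Iterable[CheckpointEra],
                         since_lsn: int) -> Set[BlockId]:
    """Union the block lists of every era that ends after since_lsn.

    An era that only partly overlaps the range is included whole, so
    the result may list a few extra blocks but never misses one.
    """
    changed: Set[BlockId] = set()
    for era in eras:
        if era.end_lsn > since_lsn:
            changed |= era.modified
    return changed


# Made-up example data: three consecutive checkpoint eras.
eras = [
    CheckpointEra(0, 100, {("rel_a", 1), ("rel_a", 2)}),
    CheckpointEra(100, 200, {("rel_a", 2), ("rel_b", 7)}),
    CheckpointEra(200, 300, {("rel_b", 9)}),
]

# Blocks an incremental backup taken from LSN 100 would need to fetch.
to_fetch = blocks_changed_since(eras, 100)
```

The same set could feed the cost-based decision: compare the number of
blocks to fetch against the amount of WAL that would otherwise have to
be replayed.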

It would also be useful for going in the reverse direction: look up
all the records (or just the last record) that modified a given block.
Instead of having to scan all the WAL you would only need to scan the
checkpoint eras that had touched that block.
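
The reverse lookup could be sketched the same way. Again this is only
an assumed shape: the Era tuple and the helper names below are
hypothetical, and a real implementation would go on to read the WAL
inside the returned LSN ranges to find the actual records.

```python
from typing import FrozenSet, List, NamedTuple, Optional, Sequence, Tuple

# Hypothetical block identifier: (relation file name, block number).
BlockId = Tuple[str, int]


class Era(NamedTuple):
    """Assumed log entry for one checkpoint era."""
    start_lsn: int
    end_lsn: int
    modified: FrozenSet[BlockId]


def eras_touching(eras: Sequence[Era],
                  block: BlockId) -> List[Tuple[int, int]]:
    """Return the (start_lsn, end_lsn) ranges of eras that modified block.

    Only WAL inside these ranges needs scanning to find every record
    that changed the block.
    """
    return [(e.start_lsn, e.end_lsn) for e in eras if block in e.modified]


def last_era_touching(eras: Sequence[Era],
                      block: BlockId) -> Optional[Tuple[int, int]]:
    """Return the range of the most recent era that modified block,
    or None if no era did. Assumes eras are ordered oldest first."""
    for e in reversed(eras):
        if block in e.modified:
            return (e.start_lsn, e.end_lsn)
    return None


# Made-up example data.
era_log = [
    Era(0, 100, frozenset({("rel_a", 1), ("rel_a", 2)})),
    Era(100, 200, frozenset({("rel_a", 2), ("rel_b", 7)})),
    Era(200, 300, frozenset({("rel_b", 9)})),
]

ranges = eras_touching(era_log, ("rel_a", 2))
latest = last_era_touching(era_log, ("rel_a", 2))
```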

--
greg
