Re: finding changed blocks using WAL scanning

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: finding changed blocks using WAL scanning
Date: 2019-04-20 00:39:51
Message-ID: 20190420003951.GQ6197@tamriel.snowman.net
Lists: pgsql-hackers

Greetings,

* Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> On Mon, Apr 15, 2019 at 11:45 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > Any caller of XLogWrite() could switch to a new segment once the
> > current one is done, and I am not sure that we would want some random
> > backend to potentially slow down to do that kind of operation.
> >
> > Or would a separate background worker do this work by itself? An
> > external tool can do that easily already:
> > https://github.com/michaelpq/pg_plugins/tree/master/pg_wal_blocks
>
> I was thinking that a dedicated background worker would be a good
> option, but Stephen Frost seems concerned (over on the other thread)
> about how much load that would generate. That never really occurred
> to me as a serious issue and I suspect for many people it wouldn't be,
> but there might be some.

While I do think we should at least be thinking about the load caused
by scanning the WAL to generate a list of changed blocks, the load I
was more concerned with in the other thread is the effort required to
merge all of those changes together over a large amount of WAL. I'm
also not saying that we couldn't have either of those pieces done by a
background worker, just that it'd be really nice to have an external
tool (or library) that can be used on an independent system to do that
work.
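
To make the merge step concrete, here is a minimal Python sketch. It
assumes each WAL segment has already been reduced to a mapping from a
(database OID, tablespace OID, relation OID) key to a set of changed
block numbers. The function name and the OID values are hypothetical and
not taken from any existing tool.

```python
from collections import defaultdict


def merge_block_maps(per_segment_maps):
    """Union the changed-block sets from many WAL segments into one map.

    Each input map is keyed by (database OID, tablespace OID, relation OID)
    and holds the block numbers changed in that segment (assumed layout).
    """
    merged = defaultdict(set)
    for segment_map in per_segment_maps:
        for rel_key, blocks in segment_map.items():
            merged[rel_key] |= set(blocks)
    return dict(merged)


# Two hypothetical segments touching overlapping blocks of one relation.
seg1 = {(16384, 1663, 24576): {0, 1, 7}}
seg2 = {(16384, 1663, 24576): {7, 8}, (16384, 1663, 24580): {3}}
merged = merge_block_maps([seg1, seg2])
```

The work grows with the amount of WAL being merged, which is the reason
for wanting to be able to run it on a separate system.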

> It's cool that you have a command-line tool that does this as well.
> Over there, it was also discussed that we might want to have both a
> command-line tool and a background worker. I think, though, that we
> would want to get the output in some kind of compressed binary format,
> rather than text. e.g.
>
> 4-byte database OID
> 4-byte tablespace OID
> any number of relation OID/block OID pairings for that
> database/tablespace combination
> 4-byte zero to mark the end of the relation OID/block OID list
> and then repeat all of the above any number of times

I agree that we'd like to get the data in a binary format of some kind.

> That might be too dumb and I suspect we want some headers and a
> checksum, but we should try to somehow exploit the fact that there
> aren't likely to be many distinct databases or many distinct
> tablespaces mentioned -- whereas relation OID and block number will
> probably have a lot more entropy.

I don't remember exactly where this idea came from, and I don't
believe it's my own (I think there's some tool which already does
this.. maybe it's rsync?), but I certainly don't think we want to
repeat the relation OID for every block, and I don't think we really
want to store a block number for every block either. Instead,
something like:

4-byte database OID
4-byte tablespace OID
relation OID
  starting-ending block numbers
    bitmap covering range of blocks
  starting-ending block numbers
    bitmap covering range of blocks
  4-byte zero to mark the end of the relation
...
4-byte database OID
4-byte tablespace OID
relation OID
  starting-ending block numbers
    bitmap covering range of blocks
  4-byte zero to mark the end of the relation
...

Only for relations which actually have changes though, of course.
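
A rough Python sketch of the range-plus-bitmap idea, covering only the
conversion between a set of changed blocks and (start, end, bitmap)
ranges. The `max_gap` threshold for starting a new range and the bit
ordering within each byte are assumptions made for illustration; nothing
above specifies them, and the on-disk framing is left out entirely.

```python
def _make_range(run):
    """Build (start, end, bitmap) for a sorted run of block numbers."""
    start, end = run[0], run[-1]
    # One bit per block in [start, end], rounded up to whole bytes.
    bitmap = bytearray((end - start) // 8 + 1)
    for blk in run:
        offset = blk - start
        bitmap[offset // 8] |= 1 << (offset % 8)
    return start, end, bytes(bitmap)


def blocks_to_ranges(blocks, max_gap=64):
    """Group changed blocks into ranges, each covered by a bitmap.

    A new range starts whenever the gap to the previous changed block
    exceeds max_gap (a hypothetical tuning knob).
    """
    ranges, run = [], []
    for blk in sorted(set(blocks)):
        if run and blk - run[-1] > max_gap:
            ranges.append(_make_range(run))
            run = []
        run.append(blk)
    if run:
        ranges.append(_make_range(run))
    return ranges


def ranges_to_blocks(ranges):
    """Inverse of blocks_to_ranges: recover the set of changed blocks."""
    blocks = set()
    for start, end, bitmap in ranges:
        for offset in range(end - start + 1):
            if bitmap[offset // 8] & (1 << (offset % 8)):
                blocks.add(start + offset)
    return blocks


changed = {0, 1, 7, 8, 1000, 1003}
ranges = blocks_to_ranges(changed)
```

Here the six changed blocks collapse into two ranges with three bytes of
bitmap in total, instead of six separate block numbers.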

I haven't implemented it, so it's entirely possible there are reasons
why it wouldn't work, but I do like the bitmap idea. I definitely think
we need a checksum, as you mentioned.
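
As an illustration of the checksum point, a minimal Python sketch that
appends a 4-byte trailer to a serialized payload and verifies it on read.
It uses `zlib.crc32` only because it is in the standard library;
PostgreSQL's WAL uses CRC-32C, which is a different polynomial, so treat
the algorithm, the little-endian trailer, and the function names as
assumptions.

```python
import struct
import zlib


def add_checksum(payload: bytes) -> bytes:
    """Append a little-endian CRC-32 of the payload as a 4-byte trailer."""
    return payload + struct.pack("<I", zlib.crc32(payload))


def verify_checksum(data: bytes) -> bytes:
    """Return the payload if the trailing CRC-32 matches, else raise."""
    if len(data) < 4:
        raise ValueError("too short to contain a checksum")
    payload, trailer = data[:-4], data[-4:]
    (expected,) = struct.unpack("<I", trailer)
    if zlib.crc32(payload) != expected:
        raise ValueError("checksum mismatch")
    return payload


framed = add_checksum(b"changed-block data")
payload = verify_checksum(framed)
```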

Thanks!

Stephen
