Re: finding changed blocks using WAL scanning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: finding changed blocks using WAL scanning
Date: 2019-04-20 20:21:52
Message-ID: CA+TgmoZLPZfmFDRGriec_dD-xVgs1j1N3=dHhg6GX13tdF-nMQ@mail.gmail.com
Lists: pgsql-hackers

On Sat, Apr 20, 2019 at 12:42 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Oh. Well, I already explained my algorithm for doing that upthread,
> > which I believe would be quite cheap.
> >
> > 1. When you generate the .modblock files, stick all the block
> > references into a buffer. qsort(). Dedup. Write out in sorted
> > order.
>
> Having all of the block references in a sorted order does seem like it
> would help, but would also make those potentially quite a bit larger
> than necessary (I had some thoughts about making them smaller elsewhere
> in this discussion). That might be worth it though. I suppose it might
> also be possible to line up the bitmaps suggested elsewhere to do
> essentially a BitmapOr of them to identify the blocks changed (while
> effectively de-duping at the same time).

I don't see why this would make them bigger than necessary. If you
sort by relfilenode/fork/blocknumber and dedup, then references to
nearby blocks will be adjacent in the file. You can then decide what
format will represent that most efficiently on output. Whether or not
a bitmap is a better idea than a list of block numbers or something else
depends on what percentage of blocks are modified and how clustered
they are.
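
Just to make step 1 concrete, a rough sketch of the sort-and-dedup pass
could look like the below. BlockRef and the function names are invented
purely for illustration (the real thing would presumably use
RelFileNode/ForkNumber/BlockNumber directly); this isn't from any actual
patch.

    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical block reference; stand-in fields for illustration. */
    typedef struct BlockRef
    {
        uint32_t    relfilenode;
        uint8_t     forknum;
        uint32_t    blocknum;
    } BlockRef;

    /* Order by relfilenode, then fork, then block number. */
    static int
    blockref_cmp(const void *a, const void *b)
    {
        const BlockRef *x = (const BlockRef *) a;
        const BlockRef *y = (const BlockRef *) b;

        if (x->relfilenode != y->relfilenode)
            return x->relfilenode < y->relfilenode ? -1 : 1;
        if (x->forknum != y->forknum)
            return x->forknum < y->forknum ? -1 : 1;
        if (x->blocknum != y->blocknum)
            return x->blocknum < y->blocknum ? -1 : 1;
        return 0;
    }

    /* Sort the accumulated references and squeeze out duplicates in
     * place; returns the number of distinct references. */
    static size_t
    sort_and_dedup(BlockRef *refs, size_t n)
    {
        size_t      keep = 0;

        if (n == 0)
            return 0;
        qsort(refs, n, sizeof(BlockRef), blockref_cmp);
        for (size_t i = 1; i < n; i++)
        {
            if (blockref_cmp(&refs[keep], &refs[i]) != 0)
                refs[++keep] = refs[i];
        }
        return keep + 1;
    }

The sorted, deduplicated array is then what gets written out, in whatever
encoding (plain list, bitmap, runs) turns out to be most compact for the
density you actually see.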

> > 2. When you want to use a bunch of .modblock files, do the same thing
> > MergeAppend does, or what merge-sort does when it does a merge pass.
> > Read the first 1MB of each file (or whatever amount). Repeatedly pull
> > an item from whichever file has the lowest remaining value, using a
> > binary heap. When no buffered data remains for a particular file,
> > read another chunk from that file.
>
> Sure, this is essentially a MergeAppend/MergeSort+GroupAgg on top to get
> the distinct set, if I'm following what you're suggesting here.

Yeah, something like that.
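
Purely as illustration, the merge pass might look something like the
following, reusing BlockRef and blockref_cmp from the earlier sketch.
ModBlockCursor and modblock_next() are made-up stand-ins for "give me the
next entry from this file's read buffer, refilling it from disk when it
runs dry", and the heap is a hand-rolled array heap rather than any
existing infrastructure.

    #include <stdbool.h>
    #include <stdlib.h>

    /* Assumed per-file cursor over one sorted .modblock file. */
    typedef struct ModBlockCursor ModBlockCursor;
    extern bool modblock_next(ModBlockCursor *cur, BlockRef *out);

    typedef struct HeapItem
    {
        BlockRef        ref;    /* smallest unconsumed ref from this file */
        ModBlockCursor *cur;    /* where to refill it from */
    } HeapItem;

    /* Re-establish the min-heap property starting at position i. */
    static void
    sift_down(HeapItem *heap, size_t n, size_t i)
    {
        for (;;)
        {
            size_t      l = 2 * i + 1;
            size_t      r = 2 * i + 2;
            size_t      min = i;
            HeapItem    tmp;

            if (l < n && blockref_cmp(&heap[l].ref, &heap[min].ref) < 0)
                min = l;
            if (r < n && blockref_cmp(&heap[r].ref, &heap[min].ref) < 0)
                min = r;
            if (min == i)
                break;
            tmp = heap[i];
            heap[i] = heap[min];
            heap[min] = tmp;
            i = min;
        }
    }

    /* Merge any number of sorted .modblock streams, emitting each
     * distinct BlockRef exactly once, in sorted order. */
    static void
    merge_modblock_files(ModBlockCursor **curs, size_t nfiles,
                         void (*emit) (const BlockRef *))
    {
        HeapItem   *heap = malloc(nfiles * sizeof(HeapItem));
        size_t      n = 0;
        BlockRef    last;
        bool        have_last = false;

        /* Prime the heap with the first entry from each file. */
        for (size_t i = 0; i < nfiles; i++)
        {
            if (modblock_next(curs[i], &heap[n].ref))
                heap[n++].cur = curs[i];
        }
        for (size_t i = n / 2; i-- > 0;)
            sift_down(heap, n, i);

        while (n > 0)
        {
            /* Emit the globally smallest ref, skipping duplicates. */
            if (!have_last || blockref_cmp(&last, &heap[0].ref) != 0)
            {
                emit(&heap[0].ref);
                last = heap[0].ref;
                have_last = true;
            }
            /* Refill from the same file, or drop it once exhausted. */
            if (!modblock_next(heap[0].cur, &heap[0].ref))
                heap[0] = heap[--n];
            sift_down(heap, n, 0);
        }
        free(heap);
    }

The memory bound is then just (number of .modblock files) x (per-file
buffer size), independent of how much WAL the files collectively cover.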

> > If each .modblock file covers 1GB of WAL, you could merge the data from
> > across 1TB of WAL using only 1GB of memory, and that's assuming you
> > have a 1MB buffer for each .modblock file. You probably don't need
> > such a large buffer. If you use, say, a 128kB buffer, you could merge
> > the data from across 8TB of WAL using 1GB of memory. And if you have
> > 8TB of WAL and you can't spare 1GB for the task of computing which
> > blocks need to be included in your incremental backup, it's time for a
> > hardware upgrade.
>
> How much additional work is it going to be to sort/dedup for each 1GB of
> WAL, though, and what will the resulting size be?

Well, all you have to do is quicksort an array with a million or so
elements. I don't know off-hand how many CPU cycles that takes, but I
doubt it's a whole lot. And for the amount of CPU time and memory
that it saves you when you actually go to use the files, I think it's
got to be worth it.
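
Spelling out the arithmetic behind the numbers quoted above:

    1TB of WAL at 1GB per .modblock file = 1024 files; 1024 x 1MB buffers  = 1GB
    8TB of WAL at 1GB per .modblock file = 8192 files; 8192 x 128kB buffers = 1GB

And on the sorting side, a quicksort of a million or so entries is on the
order of 10^6 * log2(10^6), i.e. roughly 20 million comparisons per 1GB of
WAL, which is indeed not a whole lot.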

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
