Re: finding changed blocks using WAL scanning

From: Bruce Momjian <bruce(at)momjian(dot)us>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: finding changed blocks using WAL scanning
Date: 2019-04-18 19:51:57
Message-ID: 20190418195157.q7elkivbun2lqic4@momjian.us
Lists: pgsql-hackers

On Thu, Apr 18, 2019 at 03:43:30PM -0400, Robert Haas wrote:
> You can make it kinda make sense by saying "the blocks modified by
> records *beginning in* segment XYZ" or alternatively "the blocks
> modified by records *ending in* segment XYZ", but that seems confusing
> to me. For example, suppose you decide on the first one --
> 000000010000000100000068.modblock will contain all blocks modified by
> records that begin in 000000010000000100000068. Well, that means that
> to generate the 000000010000000100000068.modblock, you will need
> access to 000000010000000100000068 AND probably also
> 000000010000000100000069 and in rare cases perhaps
> 00000001000000010000006A or even later files. I think that's actually
> pretty confusing.
>
> It seems better to me to give the files names like
> ${TLI}.${STARTLSN}.${ENDLSN}.modblock, e.g.
> 00000001.0000000168000058.00000001687DBBB8.modblock, so that you can
> see exactly which *records* are covered by that segment.

How would you choose the STARTLSN/ENDLSN? If you could do it per
checkpoint, rather than per WAL file, I think that would be great.
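
To make the quoted naming scheme concrete, here is a minimal sketch of
how such a name could be built from a timeline ID and an LSN range, and
how an LSN maps back to a WAL segment file name. The function names are
hypothetical, the 8-digit TLI and 16-digit LSN widths are inferred from
the example file name above, and the default 16MB segment size is
assumed. It does not answer how STARTLSN/ENDLSN should be chosen.

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16MB WAL segments


def modblock_name(tli: int, start_lsn: int, end_lsn: int) -> str:
    """Build a ${TLI}.${STARTLSN}.${ENDLSN}.modblock name (hypothetical helper)."""
    if end_lsn < start_lsn:
        raise ValueError("end_lsn must not precede start_lsn")
    # 8 hex digits for the timeline, 16 hex digits for each 64-bit LSN.
    return f"{tli:08X}.{start_lsn:016X}.{end_lsn:016X}.modblock"


def wal_segment_name(tli: int, lsn: int, seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Name of the WAL segment file that contains the given LSN."""
    segno = lsn // seg_size
    # Number of segments per 4GB "xlogid", as in the usual 24-character
    # WAL file name layout.
    segs_per_xlogid = 0x100000000 // seg_size
    return f"{tli:08X}{segno // segs_per_xlogid:08X}{segno % segs_per_xlogid:08X}"


def segments_covering(tli: int, start_lsn: int, end_lsn: int,
                      seg_size: int = WAL_SEGMENT_SIZE) -> list[str]:
    """All WAL segment files touched by the LSN range [start_lsn, end_lsn]."""
    first = start_lsn // seg_size
    last = end_lsn // seg_size
    return [wal_segment_name(tli, s * seg_size, seg_size)
            for s in range(first, last + 1)]


if __name__ == "__main__":
    print(modblock_name(1, 0x168000058, 0x1687DBBB8))
    print(segments_covering(1, 0x168000058, 0x1687DBBB8))
```

With the LSNs from the example, both ends of the range fall in segment
000000010000000100000068. A range chosen per checkpoint would usually
span several segments, and segments_covering would list all of them.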

> And I suspect it may also be a good idea to bunch up the records from
> several WAL files. Especially if you are using 16MB WAL files,
> collecting all of the block references from a single WAL file is going
> to produce a very small file. I suspect that the modified block files
> will end up being 100x smaller than the WAL itself, perhaps more, and
> I don't think anybody will appreciate us adding another PostgreSQL
> system that spews out huge numbers of tiny little files. If, for
> example, somebody's got a big cluster that is churning out a WAL
> segment every second, they would probably still be happy to have a new
> modified block file only, say, every 10 seconds.

Agreed.
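
As a rough back-of-the-envelope sketch of the numbers quoted above:
the 100x size ratio is Robert's guess, not a measurement, and the
function name is hypothetical.

```python
SEGMENT_BYTES = 16 * 1024 * 1024  # 16MB WAL segment, as in the quoted text
SHRINK_FACTOR = 100               # assumption: the "100x smaller" guess


def modblock_estimate(segments_per_second: int, seconds_per_file: int):
    """Return (WAL segments per modblock file, estimated bytes per file)."""
    segments = segments_per_second * seconds_per_file
    estimated_bytes = segments * SEGMENT_BYTES // SHRINK_FACTOR
    return segments, estimated_bytes


if __name__ == "__main__":
    # One file per WAL segment versus one file every 10 seconds,
    # on a cluster producing one segment per second.
    for seconds in (1, 10):
        segs, size = modblock_estimate(1, seconds)
        print(f"{seconds:>2}s per file: {segs} segment(s), ~{size / 1024:.0f} kB")
```

Under those assumptions, one file per segment is only about 160 kB,
while one file every 10 seconds covers 10 segments and is about 1.6 MB.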

--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
