Re: block-level incremental backup

From: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
To: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-08-16 10:42:52
Message-ID: CAM2+6=WSHGyj-aCc96vqfbGqdgePuVWRWgjkZwT6heTDbLk=eA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Aug 9, 2019 at 6:07 AM Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
wrote:

> Hi Jeevan,
>
> I have reviewed the backup part at code level and still looking into the
> restore(combine) and functional part of it. But, here are my comments so
> far:
>

Thank you Jeevan Ladhe for reviewing the changes.

>
> The patches need rebase.
>

Done.

> May be we should rename to something like:
> "INCREMENTAL BACKUP START WAL LOCATION" or simply "INCREMENTAL BACKUP
> START LOCATION"
> to make it more intuitive?
>

As discussed, used "INCREMENTAL BACKUP REFERENCE WAL LOCATION".

File header structure is defined in both the files basebackup.c and
> pg_combinebackup.c. I think it is better to move this to
> replication/basebackup.h.
>

Yep. Was that in my cleanup list. Done now.

> I think we can avoid having flag isrelfile in sendFile().
> Something like this:
>
Also, having isrelfile as part of following condition:
> is confusing, because even the relation files in full backup are going to
> be
> backed up by this loop only, but still, the condition reads '(!isrelfile
> &&...)'.
>

In the refactored patch I have moved full backup code in a separate
function.
And now all incremental backup code is also done in its own function.
Hopefully, the code is now more readable.

>
> IMHO, while labels are not advisable in general, it may be better to use a
> label
> here rather than a while(1) loop, so that we can move to the label in case
> we
> want to retry once. I think here it opens doors for future bugs if someone
> happens to add code here, ending up adding some condition and then the
> break becomes conditional. That will leave us in an infinite loop.
>

I kept it as is as I don't see any correctness issue here.

Similar to structure partial_file_header, I think above macro can also be
> moved
> to basebackup.h instead of defining it twice.
>

Yes. Done.

> I think this is a huge memory request (1GB) and may fail on busy/loaded
> server at
> times. We should check for failures of malloc, maybe throw some error on
> getting ENOMEM as errno.
>

Agree. Done.

> Here, should not we expect statbuf->st_size < (RELSEG_SIZE * BLCKSZ), and
> it
> should be safe to read just statbuf_st_size always I guess? But, I am ok
> with
> having this extra guard here.
>

Yes, we can do this way. Added an Assert() before that and used just
statbuf->st_size.

In sendFile(), I am sorry if I am missing something, but I am not able to
> understand why 'cnt' and 'i' should have different values when they are
> being
> passed to verify_page_checksum(). I think passing only one of them should
> be
> sufficient.
>

As discussed offline, you meant to say i and blkno.
These two are different. i represent the current block offset from the read
buffer whereas blkno is the offset from the start of the page. For
incremental
backup, they are same as we read the whole file but they are different in
case
of regular full backup where we read 4 blocks at a time. i value there will
be
between 0 and 3.

> Maybe we should just have a variable no_of_blocks to store a number of
> blocks,
> rather than calculating this say RELSEG_SIZE(i.e. 131072) times in the
> worst
> case.
>

OK. Done.

> Sorry if I am missing something, but, should not it be just:
>
> len = cnt;
>

Yeah. Done.

> As I said earlier in my previous email, we now do not need
> +decode_lsn_internal()
> as it is already taken care by the introduction of function
> pg_lsn_in_internal().
>

Yes. Done that and rebased on latest HEAD.

>
> Regards,
> Jeevan Ladhe
>

Patches attached in the previous reply.

--
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ibrar Ahmed 2019-08-16 11:12:32 Re: block-level incremental backup
Previous Message Jeevan Chalke 2019-08-16 10:23:35 Re: block-level incremental backup