Re: block-level incremental backup

From: Jeevan Ladhe <jeevan(dot)ladhe(at)enterprisedb(dot)com>
To: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-08-09 00:37:14
Message-ID: CAOgcT0NqbKpQksu=Uj9th7v360838SfCKys2hf=rFHAkX6_QnA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Jeevan,

I have reviewed the backup part at code level and still looking into the
restore(combine) and functional part of it. But, here are my comments so
far:

The patches need rebase.
----------------------------------------------------
+ if (!XLogRecPtrIsInvalid(previous_lsn))
+ appendStringInfo(labelfile, "PREVIOUS WAL LOCATION: %X/%X\n",
+ (uint32) (previous_lsn >> 32), (uint32)
previous_lsn);

May be we should rename to something like:
"INCREMENTAL BACKUP START WAL LOCATION" or simply "INCREMENTAL BACKUP START
LOCATION"
to make it more intuitive?

----------------------------------------------------

+typedef struct

+{

+ uint32 magic;

+ pg_crc32c checksum;

+ uint32 nblocks;

+ uint32 blocknumbers[FLEXIBLE_ARRAY_MEMBER];

+} partial_file_header;

File header structure is defined in both the files basebackup.c and
pg_combinebackup.c. I think it is better to move this to
replication/basebackup.h.

----------------------------------------------------

+ bool isrelfile = false;

I think we can avoid having flag isrelfile in sendFile().
Something like this:

if (startincrptr && OidIsValid(dboid) && looks_like_rel_name(filename))
{
//include the code here that is under "if (isrelfile)" block.
}
else
{
_tarWriteHeader(tarfilename, NULL, statbuf, false);
while ((cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len), fp))
> 0)
{
...
}
}

----------------------------------------------------

Also, having isrelfile as part of following condition:
{code}
+ while (!isrelfile &&
+ (cnt = fread(buf, 1, Min(sizeof(buf), statbuf->st_size - len),
fp)) > 0)
{code}

is confusing, because even the relation files in full backup are going to be
backed up by this loop only, but still, the condition reads '(!isrelfile
&&...)'.

----------------------------------------------------

verify_page_checksum()
{
while(1)
{
....
break;
}
}

IMHO, while labels are not advisable in general, it may be better to use a
label
here rather than a while(1) loop, so that we can move to the label in case
we
want to retry once. I think here it opens doors for future bugs if someone
happens to add code here, ending up adding some condition and then the
break becomes conditional. That will leave us in an infinite loop.

----------------------------------------------------

+/* magic number in incremental backup's .partial file */
+#define INCREMENTAL_BACKUP_MAGIC 0x494E4352

Similar to structure partial_file_header, I think above macro can also be
moved
to basebackup.h instead of defining it twice.

----------------------------------------------------

In sendFile():

+ buf = (char *) malloc(RELSEG_SIZE * BLCKSZ);

I think this is a huge memory request (1GB) and may fail on busy/loaded
server at
times. We should check for failures of malloc, maybe throw some error on
getting ENOMEM as errno.

----------------------------------------------------

+ /* Perform incremenatl backup stuff here. */
+ if ((cnt = fread(buf, 1, Min(RELSEG_SIZE * BLCKSZ,
statbuf->st_size), fp)) > 0)
+ {

Here, should not we expect statbuf->st_size < (RELSEG_SIZE * BLCKSZ), and it
should be safe to read just statbuf_st_size always I guess? But, I am ok
with
having this extra guard here.

----------------------------------------------------

In sendFile(), I am sorry if I am missing something, but I am not able to
understand why 'cnt' and 'i' should have different values when they are
being
passed to verify_page_checksum(). I think passing only one of them should be
sufficient.

----------------------------------------------------

+ XLogRecPtr pglsn;
+
+ for (i = 0; i < cnt / BLCKSZ; i++)
+ {

Maybe we should just have a variable no_of_blocks to store a number of
blocks,
rather than calculating this say RELSEG_SIZE(i.e. 131072) times in the worst
case.

----------------------------------------------------
+ len += cnt;
+ throttle(cnt);
+ }

Sorry if I am missing something, but, should not it be just:

len = cnt;

----------------------------------------------------

As I said earlier in my previous email, we now do not need
+decode_lsn_internal()
as it is already taken care by the introduction of function
pg_lsn_in_internal().

Regards,
Jeevan Ladhe

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2019-08-09 00:41:15 Re: POC: converting Lists into arrays
Previous Message Bruce Momjian 2019-08-09 00:24:00 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)