Re: block-level incremental backup

From: Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>
To: vignesh C <vignesh21(at)gmail(dot)com>
Cc: Jeevan Chalke <jeevan(dot)chalke(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Anastasia Lubennikova <a(dot)lubennikova(at)postgrespro(dot)ru>, Stephen Frost <sfrost(at)snowman(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: block-level incremental backup
Date: 2019-08-06 18:31:50
Message-ID: CALtqXTfuv1ReGyw6MH86YG49Jr6uwDiV=vbz4pvwFxO=bN=iMg@mail.gmail.com
Lists: pgsql-hackers

I have not looked at the patch in detail, but just some nits from my side.

On Fri, Aug 2, 2019 at 6:13 PM vignesh C <vignesh21(at)gmail(dot)com> wrote:

> On Thu, Aug 1, 2019 at 5:06 PM Jeevan Chalke
> <jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> >
> > On Tue, Jul 30, 2019 at 9:39 AM Jeevan Chalke <
> jeevan(dot)chalke(at)enterprisedb(dot)com> wrote:
> >>
> >> I am almost done writing the patch for pg_combinebackup and will post
> soon.
> >
> >
> > Attached patch which implements the pg_combinebackup utility used to
> combine
> > full basebackup with one or more incremental backups.
> >
> > I have tested it manually and it works for all best cases.
> >
> > Let me know if you have any inputs/suggestions/review comments?
> >
> Some comments:
> 1) There will be some link files created for tablespaces; we might
> require some special handling for them
>
> 2)
> + while (numretries <= maxretries)
> + {
> + rc = system(copycmd);
> + if (rc == 0)
> + return;
>
Use an API to copy the file instead of "system()"; it is better to use a
secure copy.

> + pg_log_info("could not copy, retrying after %d seconds",
> + sleeptime);
> + pg_usleep(numretries++ * sleeptime * 1000000L);
> + }
> Retry functionality is handled only for copying of full files; should
> we handle retry for copying of partial files as well?
>
The log and the sleep time do not match: you are multiplying sleeptime
by numretries++ but logging only "sleeptime".

Why are we retrying here at all? Capture the actual copy error and act
accordingly; blindly retrying does not make sense.
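
As a rough illustration only (untested; the function name and the error
handling are invented here, and a real patch would use the pg_log_* frontend
API and fsync the result), a plain read()/write() loop keeps the copy inside
the process, lets us inspect errno, and keeps the logged delay in sync with
the delay that is actually slept:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int
copy_file_with_retry(const char *src, const char *dst,
                     int maxretries, int sleeptime)
{
    for (int attempt = 0; attempt <= maxretries; attempt++)
    {
        int         srcfd = open(src, O_RDONLY);
        int         dstfd = srcfd < 0 ? -1 :
            open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        char        buf[8192];
        ssize_t     nread = -1;
        int         failed = (srcfd < 0 || dstfd < 0);
        int         saved_errno;

        /* copy the file contents in chunks; error handling simplified */
        while (!failed && (nread = read(srcfd, buf, sizeof(buf))) > 0)
        {
            if (write(dstfd, buf, nread) != nread)
                failed = 1;
        }
        saved_errno = errno;

        if (srcfd >= 0)
            close(srcfd);
        if (dstfd >= 0)
            close(dstfd);

        if (!failed && nread == 0)
            return 0;           /* copied successfully */

        if (attempt < maxretries)
        {
            /* log exactly the delay we are about to sleep */
            int         delay = (attempt + 1) * sleeptime;

            fprintf(stderr, "could not copy \"%s\": %s, retrying after %d seconds\n",
                    src, strerror(saved_errno), delay);
            sleep(delay);
        }
    }
    return -1;
}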

> 3)
> + maxretries = atoi(optarg);
> + if (maxretries < 0)
> + {
> + pg_log_error("invalid value for maxretries");
> + fprintf(stderr, _("%s: -r maxretries must be >= 0\n"), progname);
> + exit(1);
> + }
> + break;
> + case 's':
> + sleeptime = atoi(optarg);
> + if (sleeptime <= 0 || sleeptime > 60)
> + {
> + pg_log_error("invalid value for sleeptime");
> + fprintf(stderr, _("%s: -s sleeptime must be between 1 and 60\n"),
> progname);
> + exit(1);
> + }
> + break;
> we can have some range for maxretries similar to sleeptime
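
For illustration only (the upper bound of 20 is an arbitrary example, not a
proposal), mirroring the existing sleeptime check could look like this:

            case 'r':
                maxretries = atoi(optarg);
                /* bound -r the same way -s is bounded */
                if (maxretries < 0 || maxretries > 20)
                {
                    pg_log_error("invalid value for maxretries");
                    fprintf(stderr, _("%s: -r maxretries must be between 0 and 20\n"),
                            progname);
                    exit(1);
                }
                break;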
>
> 4)
> + fp = fopen(filename, "r");
> + if (fp == NULL)
> + {
> + pg_log_error("could not read file \"%s\": %m", filename);
> + exit(1);
> + }
> +
> + labelfile = malloc(statbuf.st_size + 1);
> + if (fread(labelfile, 1, statbuf.st_size, fp) != statbuf.st_size)
> + {
> + pg_log_error("corrupted file \"%s\": %m", filename);
> + free(labelfile);
> + exit(1);
> + }
> Should we check for malloc failure
>
Use pg_malloc() instead of malloc().
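
For example (sketch only): pg_malloc() from common/fe_utils.h-style frontend
memory helpers (common/fe_memutils.h) already reports "out of memory" and
exits on failure, so no explicit NULL check is needed, and the buffer can get
a terminating NUL so the label file is usable as a string:

    labelfile = pg_malloc(statbuf.st_size + 1);
    if (fread(labelfile, 1, statbuf.st_size, fp) != (size_t) statbuf.st_size)
    {
        pg_log_error("could not read file \"%s\": %m", filename);
        exit(1);
    }
    labelfile[statbuf.st_size] = '\0';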

> 5) Should we add a display of progress, as the backup may take some time?
> This can be added as an enhancement. We can get other's opinion on this.
>
Yes, we should, but this is not the right time to do that.

> 6)
> + if (nIncrDir == MAX_INCR_BK_COUNT)
> + {
> + pg_log_error("too many incremental backups to combine");
> + fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
> progname);
> + exit(1);
> + }
> +
> + IncrDirs[nIncrDir] = optarg;
> + nIncrDir++;
> + break;
>
> If the backup count increases, providing the input may be difficult.
> Shall the user provide all the incremental backups from a parent folder,
> and can we handle the ordering of the incremental backups internally?
>
Why do we have that limit in the first place?
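
If the fixed-size array is the only reason for the cap, growing it with
pg_realloc() would remove the limit entirely. A rough sketch, assuming
IncrDirs becomes a heap-allocated char ** (initially NULL) and a new
maxIncrDir variable (initially 0) tracks its allocated size:

        case 'i':
            /* grow the list on demand instead of capping it */
            if (nIncrDir >= maxIncrDir)
            {
                maxIncrDir = (maxIncrDir == 0) ? 8 : maxIncrDir * 2;
                IncrDirs = pg_realloc(IncrDirs, maxIncrDir * sizeof(char *));
            }
            IncrDirs[nIncrDir++] = optarg;
            break;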

> 7)
> + if (isPartialFile)
> + {
> + if (verbose)
> + pg_log_info("combining partial file \"%s.partial\"", fn);
> +
> + combine_partial_files(fn, IncrDirs, nIncrDir, subdirpath, outfn);
> + }
> + else
> + copy_whole_file(infn, outfn);
>
> Add verbose for copying whole file
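
E.g. something like this in the else branch, to match the partial-file case
(sketch only):

    else
    {
        if (verbose)
            pg_log_info("copying whole file \"%s\"", fn);

        copy_whole_file(infn, outfn);
    }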
>
> 8) We can also check whether approximately enough space is available on
> disk before starting the combine backup; this can be added as an
> enhancement. We can get other's opinion on this.
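
If we go that way, statvfs() would be one way to get a rough free-space
figure for the output directory; a minimal POSIX-only sketch (the function
name is made up here):

    #include <sys/statvfs.h>

    /* free bytes available to us on the filesystem containing 'path' */
    static unsigned long long
    free_space_bytes(const char *path)
    {
        struct statvfs fs;

        if (statvfs(path, &fs) != 0)
            return 0;           /* unknown; let the caller decide */

        return (unsigned long long) fs.f_bavail * fs.f_frsize;
    }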
>
> 9)
> + printf(_(" -i, --incr-backup=DIRECTORY incremental backup directory
> (maximum %d)\n"), MAX_INCR_BK_COUNT);
> + printf(_(" -o, --output-dir=DIRECTORY combine backup into
> directory\n"));
> + printf(_("\nGeneral options:\n"));
> + printf(_(" -n, --no-clean do not clean up after
> errors\n"));
>
> The description "combine backup into directory" can be "combine backup directory"
>
> 10)
> +/* Max number of incremental backups to be combined. */
> +#define MAX_INCR_BK_COUNT 10
> +
> +/* magic number in incremental backup's .partial file */
>
> MAX_INCR_BK_COUNT can be increased a little; some applications use 1
> full backup at the beginning of the month and 30 incremental
> backups for the rest of the days in the month.
>
> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com
>
>
>

--
Ibrar Ahmed
