Re: Backup taking long time !!!

From: Vladimir Borodin <root(at)simply(dot)name>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Dinesh Chandra 12108 <Dinesh(dot)Chandra(at)cyient(dot)com>, "Madusudanan(dot)B(dot)N" <b(dot)n(dot)madusudanan(at)gmail(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Backup taking long time !!!
Date: 2017-01-20 14:45:51
Message-ID: 7696A57C-C871-4A45-9141-FFD5F75CCB65@simply.name
Lists: pgsql-performance


> On 20 Jan 2017, at 16:40, Stephen Frost <sfrost(at)snowman(dot)net> wrote:
>
> Vladimir,
>
>> Increments in pgbackrest are done on file level which is not really efficient. We have done parallelism, compression and page-level increments (9.3+) in barman fork [1], but unfortunately guys from 2ndquadrant-it don’t hurry to work on it.
>
> We're looking at page-level incremental backup in pgbackrest also. For
> larger systems, we've not heard too much complaining about it being
> file-based though, which is why it hasn't been a priority. Of course,
> the OP is on 9.1 too, so.

Well, we forked barman and implemented everything mentioned above simply because we needed ~2 PB of disk space to store backups for our ~300 TB of data (our recovery window is 7 days). And on a 5 TB database it took a lot of time to make or restore a backup.

>
> As for your fork, well, I can't say I really blame the barman folks for
> being cautious- that's usually a good thing in your backup software. :)

The reason seems to be not caution but a lack of time to work on it. But yes, it took us half a year to deploy our fork everywhere, and it would have taken much longer if we didn't have a system for checking backup consistency.

>
> I'm curious how you're handling compressed page-level incremental
> backups though. I looked through barman-incr and it wasn't obvious to
> me what was going wrt how the incrementals are stored, are they ending
> up as sparse files, or are you actually copying/overwriting the prior
> file in the backup repository?

No, we store each file in the following way. First you write a map of the changed pages. Then you write the changed pages themselves. The compression is streaming, so you don't need much memory for it, but the downside of this approach is that you read each datafile twice (we rely on the page cache here).
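
To make the layout concrete, here is a minimal sketch of the "map first, pages second" format. It is not the actual barman-incr code. The function names, the on-disk encoding of the map (a count followed by block numbers), the use of gzip, and the "page LSN newer than the previous backup" test are all assumptions made for illustration.

```python
import gzip
import struct

PAGE_SIZE = 8192  # PostgreSQL default block size (BLCKSZ)


def page_lsn(page):
    """Return the LSN stored in the first 8 bytes of a page header (pd_lsn).

    Assumption: little-endian host byte order.
    """
    hi, lo = struct.unpack_from("<II", page, 0)
    return (hi << 32) | lo


def iter_pages(datafile):
    """Yield (block number, page bytes) for every page in the datafile."""
    datafile.seek(0)
    blkno = 0
    while True:
        page = datafile.read(PAGE_SIZE)
        if not page:
            return
        yield blkno, page
        blkno += 1


def write_increment(datafile, out, since_lsn):
    """Hypothetical writer: map of changed pages, then the pages themselves.

    Assumes the datafile length is a multiple of PAGE_SIZE. A real tool would
    also have to handle new all-zero pages and concurrent writes.
    """
    # First read of the datafile: only the block numbers are kept in memory.
    changed = [blkno for blkno, page in iter_pages(datafile)
               if page_lsn(page) > since_lsn]

    # GzipFile compresses as it is written to, so memory use stays small.
    with gzip.GzipFile(fileobj=out, mode="wb") as gz:
        gz.write(struct.pack("<I", len(changed)))               # map: count
        gz.write(struct.pack("<%dI" % len(changed), *changed))  # map: blocks
        # Second read of the datafile: copy only the changed pages.
        for blkno in changed:
            datafile.seek(blkno * PAGE_SIZE)
            gz.write(datafile.read(PAGE_SIZE))
    return changed


def read_increment(inp):
    """Read the map back, then return {block number: page bytes}."""
    with gzip.GzipFile(fileobj=inp, mode="rb") as gz:
        (count,) = struct.unpack("<I", gz.read(4))
        blknos = struct.unpack("<%dI" % count, gz.read(4 * count))
        return {blkno: gz.read(PAGE_SIZE) for blkno in blknos}
```

Writing the map before the pages is what forces the two reads: the full list of changed blocks has to be known before the first page goes into the compressed stream.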

> Apologies, python isn't my first
> language, but the lack of any comment anywhere in that file doesn't
> really help.

Not a problem. Actually, it would be much easier to understand if it were a series of commits rather than a single commit that we amend and force-push after each rebase on vanilla barman. We should add comments.

--
May the force be with you…
https://simply.name
