From: | Craig Jackson <craig(dot)jackson(at)broadcom(dot)com> |
---|---|
To: | Craig James <cjames(at)emolecules(dot)com> |
Cc: | pgsql-performance(at)lists(dot)postgresql(dot)org |
Subject: | Re: Postgres backup tool recommendations for multi-terabyte database in Google Cloud |
Date: | 2019-12-05 21:05:05 |
Message-ID: | CA+R1LV7uhXHSYgBM0avnW8y2sYqXzMgnyBHB55ww4GD_d4Yg6w@mail.gmail.com |
Lists: | pgsql-performance |
Thanks, I'll check it out.
On Thu, Dec 5, 2019 at 12:51 PM Craig James <cjames(at)emolecules(dot)com> wrote:
> On Thu, Dec 5, 2019 at 9:48 AM Craig Jackson <craig(dot)jackson(at)broadcom(dot)com>
> wrote:
>
>> Hi,
>>
>> We are in the process of migrating an oracle database to postgres in
>> Google Cloud and are investigating backup/recovery tools. The database
>> size is > 20 TB. We have an SLA that requires us to be able to complete a
>> full restore of the database within 24 hours. We have been testing
>> pgbackrest, barman, and GCP snapshots, but wanted to see if there are any
>> other recommendations we should consider.
>>
>> *Desirable features*
>> - Parallel backup/recovery
>> - Incremental backups
>> - Backup directly to a GCP bucket
>> - Deduplication/Compression
>>
>
> For your 24-hour-restore requirement, there's an additional feature you
> might consider: incremental restore, or what you might call "recovery in
> place"; that is, the ability to keep a more-or-less up-to-date copy, and
> then in an emergency only restore the diffs on the file system. pgbackrest
> uses a built-in rsync-like feature, plus a client-server architecture, which
> allows it to quickly determine which disk blocks need to be updated.
> Checksums are computed on each side, and data are only transferred if
> checksums differ. It's very efficient. I assume that a 20 TB database is
> mostly static, with only a small fraction of the data updated in any month.
> I believe the checksums are precomputed and stored in the pgbackrest
> repository, so you can even do this from an Amazon S3 (or whatever Google's
> Cloud equivalent is for low-cost storage) backup with just modest bandwidth
> usage.
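[The rsync-like scheme described above — compare per-block checksums against the ones stored in the repository, transfer only blocks that differ — can be sketched roughly as follows. This is an illustrative simplification, not pgbackrest's actual implementation; the function name and the use of SHA-1 are assumptions for the example.]

```python
import hashlib

BLOCK_SIZE = 8192  # PostgreSQL's default page size


def changed_blocks(local_path, repo_checksums):
    """Compare per-block checksums of a local file against the checksums
    stored in the backup repository. Only blocks whose checksums differ
    need to be transferred, which is why a mostly-static 20 TB database
    can be refreshed with modest bandwidth."""
    stale = []
    with open(local_path, "rb") as f:
        for i, expected in enumerate(repo_checksums):
            block = f.read(BLOCK_SIZE)
            actual = hashlib.sha1(block).hexdigest()
            if actual != expected:
                stale.append(i)
    return stale
```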
>
> In a cloud environment, you can do this on modestly-priced hardware (a few
> CPUs, modest memory). In the event of a failover, unmount your backup disk,
> spin up a big server, mount the database, do the incremental restore, and
> you're in business.
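[For what it's worth, the "incremental restore" step above maps onto pgbackrest's delta restore: with --delta, restore checksums the files already present in the data directory and rewrites only what differs. The stanza name "main" and the process count below are placeholder values.]

```shell
# Delta restore onto an existing (stale) data directory: pgbackrest
# checksums files already on disk and copies only those that changed.
# --process-max enables parallel restore workers.
pgbackrest --stanza=main --delta --process-max=8 restore
```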
>
> Craig (James)
>
>
>> Any suggestions would be appreciated.
>>
>> Craig Jackson
>>
>
>
>
--
Craig