From: | Holger Jakobs <holger(at)jakobs(dot)com> |
---|---|
To: | pgsql-admin(at)lists(dot)postgresql(dot)org |
Subject: | Re: How to get a more RSYNC compatible output of pg_dump? |
Date: | 2022-05-16 10:52:41 |
Message-ID: | a7e6aff9-7f13-5ebd-db1c-2cb899664909@jakobs.com |
Lists: | pgsql-admin |
On 16.05.22 at 09:56, Thorsten Schöning wrote:
> Hi everyone,
>
> for various historical reasons I maintain a database containing large
> file uploads, which makes uncompressed output of pg_dump ~200 GiB in
> size currently. I'm storing that dump to some NAS and am trying to
> forward it from there using RSYNC to multiple different additional
> offsite USB disks.
>
> I'm doing the same with the files directory of Postgres already after
> taking BTRFS snapshots etc. and for those files things work pretty
> well with RSYNC. Lots of files are skipped entirely, some are slightly
> updated in-place, some updates are a bit larger depending on the
> actual changes and when RSYNC executed last etc.
>
> Though, with the large dumps it seems to me that with every slight
> change in the actual data the entire dump gets downloaded again. I'm
> already using uncompressed dumps in the hope that the output is more
> stable and RSYNC better able to recognize unchanged parts. But I guess
> that most changes in the dumped data simply shift all subsequent
> data so far from where RSYNC expects it that, in the end, it's like
> downloading the whole file again.
>
> Is that simply the way it is or are there some optimizations possible
> when using pg_dump? I'm using Postgres 11 and don't see anything which
> seems to help in this use-case.
>
> Thanks!
>
> Kind regards
>
> Thorsten Schöning
>
Hi Thorsten,
This is an rsync question, not a pg_dump question.
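If you want to sync a new version of a file without transferring the
whole thing, rsync's delta transfer has to be active. It is the default
for remote transfers; for copies between local paths rsync defaults to
--whole-file, so you need --no-whole-file there. The option -c or
--checksum only changes how rsync decides whether a file needs updating.
Delta transfer works well only if some blocks of the file have changed,
while most others haven't. This won't be the case for a pg_dump.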
So I don't see a way of re-syncing the way you expect it to.
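To illustrate the point about changed blocks, here is a minimal Python sketch. It is a simplified model that compares fixed-size blocks at identical offsets; rsync's real matching is more sophisticated than this, so treat it only as a picture of why an in-place change and an insertion behave so differently. The block size, the sample data and the function names are all made up for the illustration.

```python
import hashlib

BLOCK_SIZE = 8  # hypothetical tiny block size, chosen only for this demo


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Split data into fixed-size blocks and return one digest per block."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def unchanged_blocks(old: bytes, new: bytes) -> int:
    """Count blocks that are identical at the same offset in both versions."""
    return sum(a == b for a, b in zip(block_digests(old), block_digests(new)))


# Made-up "dump": 100 records of exactly 8 bytes each, one record per block.
old = b"".join(b"row%05d" % i for i in range(100))

# Case 1: one record is overwritten in place, nothing moves.
in_place = bytearray(old)
in_place[80:88] = b"ROW00010"

# Case 2: a single byte is inserted, everything after it shifts by one.
inserted = old[:80] + b"X" + old[80:]

print(unchanged_blocks(old, bytes(in_place)))  # 99 of 100 blocks still match
print(unchanged_blocks(old, inserted))         # only the first 10 blocks match
```

With the in-place change almost every block is reusable, while a one-byte insertion leaves only the blocks in front of it aligned.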
Regards,
Holger
--
Holger Jakobs, Bergisch Gladbach, Tel. +49-178-9759012