From: | Bruce Momjian <bruce(at)momjian(dot)us> |
---|---|
To: | Stephen Frost <sfrost(at)snowman(dot)net> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: block-level incremental backup |
Date: | 2019-04-15 16:48:57 |
Message-ID: | 20190415164857.nosv5foqbcmeyhcl@momjian.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Apr 15, 2019 at 09:01:11AM -0400, Stephen Frost wrote:
> Greetings,
>
> * Robert Haas (robertmhaas(at)gmail(dot)com) wrote:
> > Several companies, including EnterpriseDB, NTT, and Postgres Pro, have
> > developed technology that permits a block-level incremental backup to
> > be taken from a PostgreSQL server. I believe the idea in all of those
> > cases is that non-relation files should be backed up in their
> > entirety, but for relation files, only those blocks that have been
> > changed need to be backed up.
>
> I love the general idea of having additional facilities in core to
> support block-level incremental backups. I've long been unhappy that
> any such approach ends up being limited to a subset of the files which
> need to be included in the backup, meaning the rest of the files have to
> be backed up in their entirety. I don't think we have to solve for that
> as part of this, but I'd like to see a discussion for how to deal with
> the other files which are being backed up to avoid needing to just
> wholesale copy them.
I assume you are talking about non-heap/index files. Which of those are
large enough to benefit from incremental backup?
> > I would like to propose that we should
> > have a solution for this problem in core, rather than leaving it to
> > each individual PostgreSQL company to develop and maintain their own
> > solution.
>
> I'm certainly a fan of improving our in-core backup solutions.
>
> I'm quite concerned that trying to graft this on to pg_basebackup
> (which, as you note later, is missing an awful lot of what users expect
> from a real backup solution already- retention handling, parallel
> capabilities, WAL archive management, and many more... but also is just
> not nearly as developed a tool as the external solutions) is going to
> make things unnecessairly difficult when what we really want here is
> better support from core for block-level incremental backup for the
> existing external tools to leverage.
I think there is some interesting complexity brought up in this thread.
Which options are going to minimize storage I/O, network I/O, have only
background overhead, allow parallel operation, integrate with
pg_basebackup. Eventually we will need to evaluate the incremental
backup options against these criteria.
--
Bruce Momjian <bruce(at)momjian(dot)us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2019-04-15 16:55:33 | Re: Multivariate MCV lists -- pg_mcv_list_items() seems to be broken |
Previous Message | Daniel Verite | 2019-04-15 16:35:12 | Re: Cleanup/remove/update references to OID column |