Re: Backup using GiT?

From: Chander Ganesan <chander(at)otg-nc(dot)com>
To: "James B(dot) Byrne" <byrnejb(at)harte-lyne(dot)ca>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Backup using GiT?
Date: 2008-06-13 18:46:20
Message-ID: 4852C07C.9020508@otg-nc.com
Lists: pgsql-general

James B. Byrne wrote:
> I have recently had to teach myself how to use git and the thought came to me
> that this tool might provide a fairly low setup cost way of passing pg_dumps
> over the network to our off site data store. Think Rsync, but on a file
> content basis; just the content diff gets transmitted.
>
> GiT works by compressing deltas of the contents of successive versions of file
> systems under repository control. It treats binary objects as just another
> object under control. The question is, are successive (compressed) dumps of
> an altered database sufficiently similar to make the deltas small enough to
> warrant this approach?
>
> Comments? (not on my sanity, please)
>
It probably depends on how much the database changes between dumps. For
example, a vacuum followed by an insert could leave records that were
previously at the start of the dump somewhere else, such as the middle
(i.e., a dead tuple is marked as available, then the space is reused for
an insert). In that case you would end up with a row that is unchanged
but sits at a different location in the file. Would git then back that
up? I would think so. So in essence you'd be getting "at least a diff,
but likely more". Of course, I'm assuming you are just dumping the data
in a table using pg_dump. Once you start talking about a dumpall, you
might find that smaller changes (e.g., giving a user a new privilege)
cause content to be offset even more. Add compression into the mix and
I think you could find that there are few or no similarities.
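
To make the moved-row case concrete, here is a minimal Python sketch. It
is not what pg_dump or git do internally; it only assumes plain-text,
one-row-per-line dump data, and the sample rows and the helper name
line_delta are made up for illustration. It uses difflib from the
standard library to show what a line-based comparison reports when an
unchanged row ends up at a different position in the dump.

```python
import difflib


def line_delta(old_dump: str, new_dump: str):
    """Return (removed, added) lines between two text dumps.

    Hypothetical helper for illustration: a row that merely moves
    shows up in both lists, even though its content is unchanged.
    """
    old_lines = old_dump.splitlines()
    new_lines = new_dump.splitlines()
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    removed, added = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            removed.extend(old_lines[i1:i2])
            added.extend(new_lines[j1:j2])
    return removed, added


# Made-up table contents in COPY-style tab-separated form.
rows = ["1\talice", "2\tbob", "3\tcarol", "4\tdave", "5\terin", "6\tfrank"]
old_dump = "\n".join(rows) + "\n"

# Same six rows, but row 1 now sits in the middle of the dump, as it
# might after its old slot was reclaimed and the tuple was rewritten.
new_dump = "\n".join(rows[1:3] + rows[:1] + rows[3:]) + "\n"

removed, added = line_delta(old_dump, new_dump)
print("removed:", removed)
print("added:  ", added)
```

Here the relocated row is reported once as removed and once as added,
although no data actually changed, which is the "at least a diff, but
likely more" effect in miniature.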

On the other hand, if you were only doing inserts into an optimized (no
dead tuples) table, I would think that you'd get a much better result.

Perhaps you would be better off using PITR in such cases?

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC 27560
919-463-0999/877-258-8987
http://www.otg-nc.com
