delta copies of pg_dump files

From: Robert Fitzpatrick <lists(at)webtent(dot)net>
To: PostgreSQL <pgsql-general(at)postgresql(dot)org>
Subject: delta copies of pg_dump files
Date: 2007-09-24 15:53:44
Message-ID: 1190649224.28091.22.camel@columbus.webtent.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Trying to format our pg_dump files so that we can take advantage of
rsync delta copies and the problem seems to be the sequences and maybe
very active tables not allowing much matched data in the transfer. I
have a 3GB dump file in plain text and broke off the head of two dump
files into their own and diff'd to see the following. I assume these
differences are what is causing me to get a very low percentage of
matched data in my transfer?

esmtp# ls -la data/maia.sql
-rw-r--r-- 1 root wheel 2906254629 Sep 23 22:46 data/maia.sql
esmtp# ls -la data.bak/maia.sql
-rw-r--r-- 1 root wheel 2935474571 Sep 23 19:01 data.bak/maia.sql
esmtp# head -500 data/maia.sql > new.sql
esmtp# head -500 data.bak/maia.sql > old.sql
esmtp# diff new.sql old.sql
38c38
< SELECT pg_catalog.setval('maia_mail_id_seq', 567254, true);
---
> SELECT pg_catalog.setval('maia_mail_id_seq', 565121, true);
52c52
< SELECT pg_catalog.setval('maia_stats_history_id_seq', 77273, true);
---
> SELECT pg_catalog.setval('maia_stats_history_id_seq', 76573, true);
80c80
< SELECT pg_catalog.setval('maia_viruses_id_seq', 320, true);
---
> SELECT pg_catalog.setval('maia_viruses_id_seq', 319, true);
167d166
< vscan zymotechnic(at)norfinancement(dot)com 88.247 1 7.8099999999999996
256d254
< vscan js_rodzen(at)agraturf(dot)com 84.255 1 17.577000000000002
403d400
< vscan vojtekkaleem(at)sideroad(dot)com 103.188 1 20.062000000000001
467d463
< vscan scho(dot)olonl(dot)ine(at)twilightsnack(dot)info 216.75 1 7.1710000000000003
500a497,500
> vscan gpm(at)jagyerin(dot)com 209.44 2 51.013000000000005
> vscan exiaf_radar_guy38(at)yahoo(dot)co(dot)in 59.94 2 0.58199999999999996
> vscan detoxfootpatch(at)marketingprovides(dot)net 66.248 2 14.827999999999999
> vscan lindoraleanforlife(at)estatemontr(dot)net 216.188 2 17.859999999999999

This is a very active database used as a mail cache for a couple of mail
gateways using Postfix+amavisd-maia+SA+clamAV on FreeBSD. Can I move
sequences to the end and will this help my cause? Seems the records are
changing as well, I'm not sure why there is so much changing in the
front of these dumps. Does pg_dump sort by OID? Looking in the dump
file, I see these records are coming from the AWL table, could this be
changing drastically all the time, I guess a question for my amavisd or
maia lists?

What can be done to best prepare dumps for this type of data transfer,
the file sizes are not much different and we want to save that bandwidth
to our remote facility? Thanks for the help!

--
Robert

Browse pgsql-general by date

  From Date Subject
Next Message David Brain 2007-09-24 15:55:26 Re: pg_dump problem: 'pg_dump: schema with OID 1515546 does not exist'
Previous Message Collin 2007-09-24 15:50:32 Re: Migration from PervasiveSQL