From: Chris Kratz <chris(dot)kratz(at)vistashare(dot)com>
To: "Thomas F. O'Connell" <tfo(at)sitening(dot)com>
Cc: PgSQL General <pgsql-general(at)postgresql(dot)org>
Subject: Re: pg_dump in a production environment
Date: 2005-05-23 21:54:28
Message-ID: 200505231754.28376.chris.kratz@vistashare.com
Lists: pgsql-general
Hello Thomas,
We've had, and still have, exactly the same issue, and we have been unable
to find a satisfactory solution. Currently we "just live with it". We do
periodic backups with pg_dump on an hourly basis, and during the dump other
access to the db is incredibly slow, making our web app feel sluggish for
the 5 to 10 minutes it takes to dump the db.
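
For reference, the hourly job is really just a cron entry wrapping pg_dump.
A minimal sketch; the host, database name, and paths here are placeholders,
not our real setup:

    # crontab: dump the database at the top of every hour
    # (in crontab, % must be escaped as \%)
    0 * * * * pg_dump -h db-host -F c -f /backups/app_$(date +\%H).dump app_db
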
After a lot of research, it appears to be an I/O and memory contention
issue. Basically, the dump procedure has to pull all the tables and data
into memory to dump them, which means any other request has to pull the
data it needs back off of disk (it got paged out to make room for the dump
data), making those requests very slow. This is compounded by the fact that
pg_dump usually saturates your I/O throughput. Since Postgres doesn't manage
the filesystem buffers (the OS does), there appears to be no easy way to
tell it to use only x amount of memory for the dump and leave the rest
available for the running database. I have a hunch that the same thing
happens with the shared buffers, though I haven't proven it. This wasn't a
problem for us while the db fit into RAM, but we've grown far past that
point now.
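
You can actually watch this happen. A rough sketch of what we look at while
a dump runs; these are stock Linux tools, nothing Postgres-specific:

    # paging activity and I/O wait, sampled once a second
    vmstat 1
    # per-device utilization; the dump typically pins one disk near 100%
    iostat -x 2
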
The only solution we have ever found is to use something like a Slony slave
and do the dumps from it. The slave takes the performance hit and your
primary db keeps running at full speed. Once the dump is done, the slave
can catch up if it needs to. Unfortunately, I believe there are currently
issues with restoring from a dump taken off a slave.
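
Mechanically it's the same dump command, just pointed at the slave; a
sketch, with a hypothetical hostname:

    # dump the replicated copy; the master never sees the I/O load
    pg_dump -h slony-slave -F c -f /backups/app.dump app_db
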
-Chris
On Monday 23 May 2005 04:56 pm, Thomas F. O'Connell wrote:
> Okay, I collated the three replies I got below for ease in replying.
>
> I vacuum full analyze and reindexdb approximately once a month, but I
> use pg_autovacuum as a matter of ongoing maintenance, and it seems to
> hit equilibrium pretty well and seems to prevent bloat. The last time
> I checked a vacuum analyze verbose, I had plenty of FSM to spare. The
> data grows, but it doesn't seem to grow so quickly that I'd already
> be out of FSM space.
>
> I actually run pg_dump from a remote machine, so I/O contention on
> the partition with $PGDATA shouldn't be an issue.
>
> And here is the actual command:
>
> pg_dump -h <host> -F c <database> > <dumpfile>
>
> Pretty basic, although it is compressing.
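>
> (Side note: since -F c produces a custom-format archive, it gets
> restored with pg_restore rather than psql. A minimal sketch, database
> name hypothetical:
>
>     # recreate schema and data from the custom-format dump
>     pg_restore -d app_db <dumpfile>
>
> The compression is built into the custom format, so there's no separate
> gzip step.)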
>
> As far as I can tell, the postmaster handling the dump request takes
> up quite a bit of CPU, but not itself to the point where the database
> should be unusable under ordinary circumstances. E.g., when a
> query/backend eats up that much CPU, it doesn't prevent further access.
>
> I'm suspicious more of something involving locks than of CPU.
>
> Oh, and one other small(ish) detail: the dumping client is using a
> 7.4.8 installation, whereas the server itself is 7.4.6.
>
> -tfo
>
> --
> Thomas F. O'Connell
> Co-Founder, Information Architect
> Sitening, LLC
>
> Strategic Open Source: Open Your i™
>
> http://www.sitening.com/
> 110 30th Avenue North, Suite 6
> Nashville, TN 37203-6320
> 615-260-0005
>
> > From: Scott Marlowe <smarlowe(at)g2switchworks(dot)com>
> > Date: May 23, 2005 3:18:33 PM CDT
> > To: "Thomas F. O'Connell" <tfo(at)sitening(dot)com>
> > Cc: PgSQL General <pgsql-general(at)postgresql(dot)org>
> > Subject: Re: [GENERAL] pg_dump in a production environment
> >
> > Basically, it sounds like PostgreSQL is doing a lot of very long
> > sequential scans to do this backup. Have you done a vacuum full
> > lately? It could be that you've got a lot of table bloat that's
> > making the seq scans take so long.
> >
> > You could be I/O saturated already, and the backup is just pushing you
> > over the edge of the performance knee.
> >
> > I'd do a 'vacuum analyze verbose' and see if you need more FSM set up
> > for your regular vacuums to keep up.
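> >
> > Concretely, that's something like the following; the database name and
> > conf values are placeholders (max_fsm_* are the 7.4-era knobs):
> >
> >     # run the verbose vacuum and read the FSM summary at the end
> >     psql -d app_db -c "VACUUM VERBOSE ANALYZE"
> >     # if it reports more pages needed than you've allotted, raise
> >     # these in postgresql.conf and restart:
> >     #   max_fsm_pages = 200000
> >     #   max_fsm_relations = 1000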
> >
> > From: "Matthew T. O'Connor" <matthew(at)zeut(dot)net>
> > Date: May 23, 2005 3:18:18 PM CDT
> > To: "Thomas F. O'Connell" <tfo(at)sitening(dot)com>
> > Cc: PgSQL General <pgsql-general(at)postgresql(dot)org>
> > Subject: Re: [GENERAL] pg_dump in a production environment
> >
> > Could this be an I/O saturation issue like the one the vacuum delay
> > settings are supposed to help with? Perhaps we could either extend
> > the vacuum delay settings to affect pg_dump, or add a new option to
> > pg_dump that would have it slow down the dump.
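> >
> > For reference, the cost-based delay settings in question are the 8.0
> > ones (so they wouldn't apply to a 7.4 server anyway); values here are
> > illustrative only:
> >
> >     # postgresql.conf (8.0):
> >     vacuum_cost_delay = 10    # ms to sleep per cost batch; 0 disables
> >     vacuum_cost_limit = 200   # cost accumulated before each sleep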
> >
> > BTW, have you tried running pg_dump from a separate machine? Or
> > even just making sure that the dump file is being written to a
> > different disk drive than PostgreSQL is running on. All that disk
> > write activity is bound to slow the system down.
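> >
> > I.e., at minimum, redirect the output to another spindle; the mount
> > point here is hypothetical:
> >
> >     # write the dump somewhere other than the $PGDATA disk
> >     pg_dump -F c <database> > /mnt/backupdisk/db.dump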
> >
> > Matthew
> >
> > From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
> > Date: May 23, 2005 3:25:23 PM CDT
> > To: "Thomas F. O'Connell" <tfo(at)sitening(dot)com>
> > Cc: PgSQL General <pgsql-general(at)postgresql(dot)org>
> > Subject: Re: [GENERAL] pg_dump in a production environment
> > Reply-To: Martijn van Oosterhout <kleptog(at)svana(dot)org>
> >
> >
> > What's your pg_dump command? Some options may take a lot of memory.
> >
> > If you list the processes while this is going on, do you see one
> > chewing all your memory? I.e., what's really causing the problem...
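> >
> > Something along these lines (GNU ps; check the sort flag on your
> > platform):
> >
> >     # biggest memory consumers while the dump is running
> >     ps aux --sort=-rss | head -15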
> >
> > Hope this helps,
--
Chris Kratz