Re: Streaming Replication: Observations, Questions and Comments

From: Greg Smith <greg(at)2ndQuadrant(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Streaming Replication: Observations, Questions and Comments
Date: 2011-08-24 16:30:06
Message-ID: 4E55270E.10509@2ndQuadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 08/24/2011 11:33 AM, Samba wrote:
> One strange thing I noticed is that the pg_xlogs on the master have
> outsized the actual data stored in the database by at least 3-4 times,
> which was quite surprising. I'm not sure if 'restore_command' has
> anything to do with it. I did not understand why transaction logs
> would need to be so many times larger than the actual size of the
> database, have I done something wrong somewhere?

That's common to see. Systems that regularly UPDATE the same rows often
can easily end up with a WAL stream much larger than the database. The
WAL data contains enough information to replay every point in time from
the base backup until the current time. That can be significantly
larger than the database, which just holds the latest copy of the data.

One of the biggest things that makes your WAL large are the full page
writes that protect against incomplete writes. See "question regarding
full_page_writes" thread happening on this list recently for details.
Each time you touch a page, per checkpoint, another full copy of that
page is written out.

What I have to do in a lot of cases is significantly decrease the number
of checkpoints in order to keep this overhead under control. The
default config has a checkpoint every checkpoint_segments of work, and
every checkpoint_timeout of time. That makes for a checkpoint every 5
minutes, and even more often under heavy load.

If you increase checkpoint_segments a whole lot, all of your checkpoints
will be based on the timeout instead. Then you can see how WAL load
decreases as you increase checkpoint_timeout. I've had to set
checkpoint_timeout as high as 30 minutes before on busy systems, to
lower the WAL overhead.

--
Greg Smith 2ndQuadrant US greg(at)2ndQuadrant(dot)com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Richard Broersma 2011-08-24 16:30:41 Feature Request: DDL + RegExp - definitions
Previous Message bricklen 2011-08-24 16:22:21 plperlu function caused a segmentation fault