Re: pg_xlog size growing untill it fills the partition

From: Ken Brush <kbrush(at)gmail(dot)com>
To: Michal TOMA <mt(at)sicoop(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: pg_xlog size growing untill it fills the partition
Date: 2013-10-03 22:16:28
Message-ID: CANCJzPa1asjuw1pw3THfksDMTh5wgz7uVt=i=pSy114rz40sGQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Try setting the following in your postgresql.conf:

wal_keep_segments = 0

On Thu, Oct 3, 2013 at 2:56 PM, Michal TOMA <mt(at)sicoop(dot)com> wrote:

> Hello,
>
> I have a problem on my pg 9.2.4 setup (OpenSuse 12.2, kernel 3.2.13).
> My pg_xlog directory is growing uncontrolably untill it fills the
> partition. The database is under heavy write load and is spread on two
> tablesapces one on a ssd software raid1 partition and a second one on a hdd
> software raid1 partition.
> I have no wal archiving enabled nor any replication.
>
> I have tried different checkpoint related parameters without any noticable
> improvement.
> Now I have:
> checkpoint_completion_target = 0.9
> wal_buffers = 8MB
> checkpoint_segments = 16
> checkpoint_timeout = 20min
> shared_buffers = 2GB
> log_checkpoints = on
>
> This is what I can see in the log:
> 2013-10-03 13:58:56 CEST LOG: checkpoint starting: xlog
> 2013-10-03 13:59:56 CEST LOG: checkpoint complete: wrote 448 buffers
> (0.2%); 0 transaction log file(s) added, 9 removed, 18 recycled;
> write=39.144 s, sync=21.136 s, total=60.286 s; sync files=380,
> longest=14.517 s, average=0.055 s
> 2013-10-03 14:04:07 CEST LOG: checkpoint starting: xlog
> 2013-10-03 15:27:01 CEST LOG: checkpoint complete: wrote 693 buffers
> (0.3%); 0 transaction log file(s) added, 0 removed, 16 recycled;
> write=90.775 s, sync=4883.295 s, total=4974.074 s; sync files=531,
> longest=152.855 s, average=9.196 s
> 2013-10-03 15:27:01 CEST LOG: checkpoint starting: xlog time
> 2013-10-03 19:06:30 CEST LOG: checkpoint complete: wrote 3467 buffers
> (1.3%); 0 transaction log file(s) added, 0 removed, 16 recycled;
> write=122.555 s, sync=13046.077 s, total=13168.637 s; sync files=650,
> longest=234.697 s, average=20.069 s
> 2013-10-03 19:06:30 CEST LOG: checkpoint starting: xlog time
> 2013-10-03 22:30:25 CEST LOG: checkpoint complete: wrote 10198 buffers
> (3.9%); 0 transaction log file(s) added, 216 removed, 33 recycled;
> write=132.229 s, sync=12102.311 s, total=12234.608 s; sync files=667,
> longest=181.374 s, average=18.144 s
> 2013-10-03 22:30:25 CEST LOG: checkpoint starting: xlog time
>
> I ran pg_test_fsync on the hard disk when postgresql is down and I have
> the following results:
>
> -------------------------------------------------------------------------------------------------------------
> O_DIRECT supported on this platform for open_datasync and open_sync.
>
> Compare file sync methods using one 8kB write:
> (in wal_sync_method preference order, except fdatasync
> is Linux's default)
> open_datasync 59.639 ops/sec
> fdatasync 43.959 ops/sec
> fsync 34.497 ops/sec
> fsync_writethrough n/a
> open_sync 35.612 ops/sec
>
> Compare file sync methods using two 8kB writes:
> (in wal_sync_method preference order, except fdatasync
> is Linux's default)
> open_datasync 24.199 ops/sec
> fdatasync 38.680 ops/sec
> fsync 35.412 ops/sec
> fsync_writethrough n/a
> open_sync 13.337 ops/sec
>
> Compare open_sync with different write sizes:
> (This is designed to compare the cost of writing 16kB
> in different write open_sync sizes.)
> 1 * 16kB open_sync write 5.499 ops/sec
> 2 * 8kB open_sync writes 21.412 ops/sec
> 4 * 4kB open_sync writes 13.065 ops/sec
> 8 * 2kB open_sync writes 6.720 ops/sec
> 16 * 1kB open_sync writes 3.320 ops/sec
>
> Test if fsync on non-write file descriptor is honored:
> (If the times are similar, fsync() can sync data written
> on a different descriptor.)
> write, fsync, close 36.353 ops/sec
> write, close, fsync 37.347 ops/sec
>
> Non-Sync'ed 8kB writes:
> write 500365.249 ops/sec
>
> -------------------------------------------------------------------------------------------------------------
>
> When the server is up and running under the usual load I get the following
> results:
>
> -------------------------------------------------------------------------------------------------------------
> 2 seconds per test
> O_DIRECT supported on this platform for open_datasync and open_sync.
>
> Compare file sync methods using one 8kB write:
> (in wal_sync_method preference order, except fdatasync
> is Linux's default)
> open_datasync 0.369 ops/sec
> fdatasync 0.575 ops/sec
> fsync 0.125 ops/sec
> fsync_writethrough n/a
> open_sync 0.222 ops/sec
>
> Compare file sync methods using two 8kB writes:
> (in wal_sync_method preference order, except fdatasync
> is Linux's default)
> open_datasync 0.383 ops/sec
> fdatasync 2.171 ops/sec
> fsync 1.318 ops/sec
> fsync_writethrough n/a
> open_sync 0.929 ops/sec
>
> Compare open_sync with different write sizes:
> (This is designed to compare the cost of writing 16kB
> in different write open_sync sizes.)
> 1 * 16kB open_sync write 0.079 ops/sec
> 2 * 8kB open_sync writes 0.041 ops/sec
> 4 * 4kB open_sync writes 0.194 ops/sec
> 8 * 2kB open_sync writes 0.013 ops/sec
> 16 * 1kB open_sync writes 0.005 ops/sec
>
> Test if fsync on non-write file descriptor is honored:
> (If the times are similar, fsync() can sync data written
> on a different descriptor.)
> write, fsync, close 0.098 ops/sec
> write, close, fsync 0.067 ops/sec
>
> Non-Sync'ed 8kB writes:
> write 0.102 ops/sec
>
> -------------------------------------------------------------------------------------------------------------
>
> I need to tell to the server to limit the amount of wal files in pg_xlog
> somehow whatever the efect on the performance could be.
>
> As for now I have to monitor the disk size and manually restart the
> server. It takes a few seconds to process all the wal files and resume.
> I have to do this a few times a day as it takes about 10 hours to get 20GB
> of wal files in the pg_xlog directory (what fills my partion on the ssd).
>
> Any help with this issue would be greatly appreciated as I don't see what
> else I can try.
>
> Michal
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ian Lawrence Barwick 2013-10-03 22:40:54 Re: Comparing Oracle vs Postgres Fwd: [ADMIN] Database replication to 50-80 nodes
Previous Message Michal TOMA 2013-10-03 21:56:35 pg_xlog size growing untill it fills the partition