"Leaking" disk space on FreeBSD servers

From: Dan Thomas <godders(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: "Leaking" disk space on FreeBSD servers
Date: 2013-03-20 11:49:07
Message-ID: CAG8duQ2OaP9aVY4p5f6pfe6VxHwryaeYtA7FEK6z19iNZtwBOA@mail.gmail.com
Lists: pgsql-general

Hi Guys,

We're seeing a problem with some of our FreeBSD/PostgreSQL servers
"leaking" quite significant amounts of disk space:

> df -h /usr/local/pgsql/
Filesystem      Size  Used  Avail  Capacity  Mounted on
/dev/mfid1s1d   1.1T  772G  222G   78%       /usr/local/pgsql

> du -sh /usr/local/pgsql/
741G /usr/local/pgsql/
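
To put a number on the discrepancy, the two figures above can be subtracted directly. The following is only a minimal sketch: the helper names `to_kib` and `unaccounted_gib` are made up for illustration, and it assumes the `-h` suffixes are base-1024 units.

```python
# Sizes as printed by `df -h` / `du -sh`, assumed to be base-1024 units.
UNITS = {"K": 1, "M": 1024, "G": 1024**2, "T": 1024**3}  # multipliers in KiB


def to_kib(text):
    """Convert a human-readable size such as '772G' to KiB."""
    number, suffix = text[:-1], text[-1].upper()
    return float(number) * UNITS[suffix]


def unaccounted_gib(df_used, du_total):
    """Space df reports as used that du cannot find in any reachable file."""
    return (to_kib(df_used) - to_kib(du_total)) / UNITS["G"]


gap = unaccounted_gib("772G", "741G")
print(f"unaccounted: {gap:.0f}G")  # prints "unaccounted: 31G"
```

So roughly 31G is allocated on the filesystem but not attributable to any file that du can reach.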

Stopping Postgres doesn't fix it, but rebooting does, which to me points at
the OS rather than PG. However, the leak is only apparent in the dedicated
pgsql partition, and only on our database servers, so PostgreSQL seems to
at least be involved. The partition itself is a relatively standard UFS
partition:

> grep /usr/local/pgsql /etc/fstab
/dev/mfid1s1d /usr/local/pgsql ufs rw 2 2

> tunefs -p /usr/local/pgsql/
tunefs: POSIX.1e ACLs: (-a) disabled
tunefs: NFSv4 ACLs: (-N) disabled
tunefs: MAC multilabel: (-l) disabled
tunefs: soft updates: (-n) enabled
tunefs: gjournal: (-J) disabled
tunefs: trim: (-t) disabled
tunefs: maximum blocks per file in a cylinder group: (-e) 2048
tunefs: average file size: (-f) 16384
tunefs: average number of files in a directory: (-s) 64
tunefs: minimum percentage of free space: (-m) 8%
tunefs: optimization preference: (-o) time
tunefs: volume label: (-L)

lsof isn't showing any open files that have been deleted (link count 0):

> lsof +L /usr/local/pgsql/ | awk '{ print $8 }' | grep 0 | wc -l
0
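
The reason for checking link counts is that a file which has been unlinked while still held open keeps its blocks allocated: df counts them, but du cannot see them. Below is a minimal sketch of that effect, assuming a POSIX system and using a throwaway temporary file; the function name `unlinked_file_stats` is hypothetical.

```python
import os
import tempfile


def unlinked_file_stats(payload: bytes):
    """Write payload to a temp file, unlink it while it is still open,
    and report what the kernel still knows about it."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)           # remove the only directory entry
        st = os.fstat(fd)         # the open descriptor is still valid
        return st.st_nlink, st.st_size, os.path.exists(path)
    finally:
        os.close(fd)              # only now can the space be released


nlink, size, visible = unlinked_file_stats(b"x" * 4096)
print(nlink, size, visible)
```

On a POSIX system the link count comes back as zero and the path is gone, yet the data is still there until the descriptor is closed. A process holding such a file would show up in the lsof check above, which here returned nothing.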

We're not creating filesystem snapshots:

> find /usr/local/pgsql/ -flags snapshot
>

Not all of our servers are leaking space; it's only the more
recently installed systems. Here's a quick breakdown of versions:

FreeBSD  PostgreSQL  Leaking?
8.0      8.4.4       no
8.2      9.0.4       no
8.3      9.1.4       yes
8.3      9.2.3       yes
9.1      9.2.3       yes

Each of these servers is configured with a warm standby, so we've been
switching them over to the standby to reclaim the space (rebooting the
primary is too much downtime). The standby does *not* demonstrate this
problem while it's being used as a standby, but it starts leaking space
once it's been made the primary.

Initially I thought this might be related to WAL files; however, the
pg_xlog directory is symlinked outside of the /usr/local/pgsql partition
that is demonstrating this problem:

> ll /usr/local/pgsql/data/pg_xlog
lrwxr-xr-x 25B Oct 19 10:48 pg_xlog -> /usr/local/pglog/pg_xlog/

I've exhausted everything I can think of to try to solve this one. Has
anyone got any ideas on how to go about debugging this?

Thanks,

Dan
