Re: pg_basebackup bug: base backup is double the size of the database

From: Craig James <cjames(at)emolecules(dot)com>
To: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: pg_basebackup bug: base backup is double the size of the database
Date: 2015-01-21 17:45:12
Message-ID: CAFwQ8rdqPrRFX7rAXuip5GdRb=HmLwK_9XVsYKqD0_TDAw2T3g@mail.gmail.com
Lists: pgsql-admin

One clarification:

On Wed, Jan 21, 2015 at 9:32 AM, Craig James <cjames(at)emolecules(dot)com> wrote:

> We've encountered a serious bug with pg_basebackup. It seems to be
> following hard links and duplicating all files in the tablespaces rather
> than preserving links.
>

It could be barman, not pg_basebackup, that has the bug. I had assumed that
barman runs pg_basebackup, but one of my colleagues pointed out that barman
may or may not use pg_basebackup to do its work.

Craig

>
> Drilling down into one specific tablespace, we find this:
>
> # ls -l /data/postgres-9.3/main/pg_tblspc/16747
> lrwxrwxrwx 1 postgres postgres 27 2014-08-18 11:28 /data/postgres-9.3/main/pg_tblspc/16747 -> /postgres/tablespaces/uorsy/
>
> # du -sh /data/postgres-9.3/tablespaces/uorsy
> *35G* /data/postgres-9.3/tablespaces/uorsy
>
> # du -sh /data/postgres-9.3/tablespaces/uorsy/*
> *35G* /data/postgres-9.3/tablespaces/uorsy/8208624
> *8.1M* /data/postgres-9.3/tablespaces/uorsy/PG_9.3_201306121
> 4.0K /data/postgres-9.3/tablespaces/uorsy/pgsql_tmp
> 4.0K /data/postgres-9.3/tablespaces/uorsy/PG_VERSION
>
> # find /data/postgres-9.3/tablespaces/uorsy \! -links 1 -type f | wc -l
> *740*
>
>
> In other words, this tablespace has 35G of real data, plus 740 hard links
> that effectively duplicate each data file.
>
> When we look at the same data in the archive that pg_basebackup creates
> (invoked via barman), we find this:
>
> # du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747
> *70G* /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747
>
> # du -sh /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/*
> *35G* /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/8208624
> *35G* /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_9.3_201306121
> 4.0K /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/pgsql_tmp
> 4.0K /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747/PG_VERSION
>
> # find /pg_archive/staging/base/20150114T170002/pgdata/pg_tblspc/16747 \! -links 1 -type f | wc -l
> *0*
>
>
> That is, the archive contains no hard links, and all of the data files are
> duplicated. And of course, when we try to actually use this archive to
> recover, it's twice the size of the original database and doesn't fit on
> our disks.
>
> My guess is that pg_basebackup is using (or doing the equivalent of)
> rsync(1) without the --hard-links option, and that these hard links were
> created by pg_upgrade when we went from 8.4.17 to 9.3.5.
>
> What can we do to fix this? The whole cluster is about 350 databases and
> 800GB.
>
> Thanks,
> Craig
>
>
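
One way to confirm where the links get lost is to measure a tree twice: once
counting every path, and once counting each inode only once. The Python
sketch below is illustrative only. The function names tree_sizes and demo are
made up for this example, and shutil.copytree is just a stand-in for a copy
that does not preserve hard links; it is not a claim about what barman or
pg_basebackup do internally.

```python
import os
import shutil
import stat
import tempfile


def tree_sizes(root):
    """Return (bytes_per_path, bytes_per_inode) for regular files under root.

    bytes_per_path counts every directory entry, so a file with two hard
    links is counted twice. bytes_per_inode counts each (device, inode)
    pair once, so hard-linked files are counted a single time.
    """
    per_path = 0
    per_inode = 0
    seen = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks and special files
            per_path += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                per_inode += st.st_size
    return per_path, per_inode


def demo():
    """Build a tiny tree with one hard-linked file, then copy it naively."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(os.path.join(src, "old"))
        os.makedirs(os.path.join(src, "new"))
        data = os.path.join(src, "old", "relfile")
        with open(data, "wb") as fh:
            fh.write(b"x" * 4096)
        # Second name for the same inode (hypothetical layout for the demo).
        os.link(data, os.path.join(src, "new", "relfile"))

        dst = os.path.join(tmp, "dst")
        # copytree copies file contents; it does not recreate hard links.
        shutil.copytree(src, dst)
        return tree_sizes(src), tree_sizes(dst)


if __name__ == "__main__":
    src_sizes, dst_sizes = demo()
    print("source (per path, per inode):", src_sizes)
    print("copy   (per path, per inode):", dst_sizes)
```

In the source tree the two numbers differ because both names share one inode;
in the copy they match, which is the same pattern as the 35G versus 70G
figures above. Running tree_sizes on the live tablespace and on the archive
directory should show whether the backup has the same per-inode size as the
original.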

--
---------------------------------
Craig A. James
Chief Technology Officer
eMolecules, Inc.
---------------------------------
