Re: pg_upgrade + replica servers + rsync --size-only is unsafe

From: Rob Emery <re-pgsql(at)codeweavers(dot)net>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: pgsql-admin <pgsql-admin(at)postgresql(dot)org>
Subject: Re: pg_upgrade + replica servers + rsync --size-only is unsafe
Date: 2020-12-03 11:41:52
Message-ID: CAPCETptm3cV_sr4uXgA_55Tc20RQ-1EN=fOBaH8+WhpRU_pZvw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hello Stephen,

>Then you run the rsync, at the /srv level, and all those 'new files' in
>/srv/new_cluster on the primary should get copied over to
>/srv/new_cluster on the replica, while any files which are hard-linked
>between old_cluster and new_cluster should end up as hard links on the
>replica.
>
>Thanks,

Thanks very much for your reply.

We've re-ran our test upgrade this morning just to be certain and we
reproduced the same result once again, it appears that the issue is not
that the pg_controlfile is hardlinked, but that it exists at all.

The output of pg_controldata shows a different ID on the master and the
secondary as you suspected:

Master:

$ ./pg_controldata /var/lib/postgresql/9.6/main
pg_control version number: 960
Catalog version number: 201608131
Database system identifier: 6901965895850147745
Database cluster state: in production
pg_control last modified: Thu 03 Dec 2020 10:11:21 GMT

Secondary:

$ ./pg_controldata /var/lib/postgresql/9.6/main
pg_control version number: 960
Catalog version number: 201608131
Database system identifier: 6901966223447496344
Database cluster state: shut down
pg_control last modified: Thu 03 Dec 2020 09:54:00 GMT

The pg_control file does exist on the secondary 9.6 prior to the rsync
because it is created by the initdb command (which is step 4. of the
instructions - https://www.postgresql.org/docs/9.6/pgupgrade.html; we
verified that by running `/usr/lib/postgresql/9.6/bin/initdb
/var/lib/postgresql/999/main` and it creates a pg_control file).

The pg_control files are both exactly 8192 bytes; removing the --size-only
option resolves this and causes rsync to copy the control file across,
which can be confirmed by the --dry-run option:

$ rsync --verbose --dry-run --archive --delete --hard-links
--no-inc-recursive /media/postgresql/data/main /media/postgresql/data/9.6
PGRETESTA02:/media/postgresql/data
building file list ... done

... <snip>

9.6/main/global/pg_control

... <snip>

sent 252,838 bytes received 666 bytes 507,008.00 bytes/sec
total size is 12,061,021,649 speedup is 47,577.24 (DRY RUN)

So it seems like setting --size-only guarantees disaster in this flow?
However we are definitely confused as to why this isn't hit more often, as
it seems that this would always happen?

I've attached the following that might clarify:
- raw output of the rsync that we ran during step 12.a of our instructions
(master-rsync.txt)
- pg_controldata raw output from both master and secondary once the error
occurs (master-pgcontroldata.txt and secondary-pgcontroldata.txt)
- A copy of our internal instructions with the commands that we're using at
each step to build our script. We've based this on the instructions at
https://www.postgresql.org/docs/9.6/pgupgrade.html with some tweaks for our
configuration.

Some things to note about our setup which I'm adding for completeness as it
may have a bearing on the result we're seeing:
- the 9.5 database is currently in /media/postgresql/data/main and we are
moving to a /media/postgresql/data/main/$version structure to make it
easier for upgrades as part of this move (hence why the rsync command isn't
'symmetrical' in the arguments)
- additionally we symlink the /var/lib/postgresql/$version/main directory
off to that location.
- We're using Debian 9 and the postgresql apt packages
- As we are using Debian, the configuration for the postgresql DB's lives
in /etc/postgresql/$version/main

Many Thanks
Rob

On Wed, 2 Dec 2020 at 17:28, Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> Greetings,
>
> * Rob Emery (re-pgsql(at)codeweavers(dot)net) wrote:
> > We're pretty sure that we've just hit a scenario where the instructions
> for
> > pg_upgrade with standby servers aren't quite safe.
>
> I've pretty much always felt those instructions should come with a
> "expert use only" note.
>
> > We were testing our upgrade process with a copy of our live cluster in a
> > lab environment and we found that using
> >
> > rsync --verbose --archive --delete --hard-links --size-only
> > --no-inc-recursive /media/postgresql/data/main /media/postgresql/data/9.6
> > PGRETESTA02:/media/postgresql/data
> >
> > Resulted in:
> >
> > 2020-12-02 14:49:11.513 GMT [20884-1] LOG: database system was shut
> > down in recovery at 2020-12-02 13:57:56 GMT
> > 2020-12-02 14:49:11.513 GMT [20884-2] LOG: entering standby mode
> > 2020-12-02 14:49:11.557 GMT [20884-3] LOG: consistent recovery state
> > reached at 19E/25000098
> > 2020-12-02 14:49:11.557 GMT [20884-4] LOG: invalid record length at
> > 19E/25000098: wanted 24, got 0
> > 2020-12-02 14:49:11.559 GMT [20883-1] LOG: database system is ready
> > to accept read only connections
> > 2020-12-02 14:49:11.593 GMT [20888-1] FATAL: database system
> > identifier differs between the primary and standby
> > 2020-12-02 14:49:11.593 GMT [20888-2] DETAIL: The primary's
> > identifier is 6901669428825624285, the standby's identifier is
> > 690161835164
> > 1138930.
> >
> > however if we don't use --size-only, then the process worked fine and we
> a
> > smooth upgrade.
>
> This doesn't look like a --size-only issue though- how did you end up
> with this when the pg_controldata file (where the system identifier is
> pulled from) shouldn't even exist and won't be a hardlink to the
> existing one on the old system since it's a new cluster, and therefore
> should definitely be copied?
>
> When you go to run the rsync to get the replica up to date you should
> have a system that looks like this:
>
> PRIMARY:
>
> /srv/old_cluster
> ... lots of files
> /srv/new_cluster
> ... lots of *new* files, including pg_controldata
> ... hardlinks to PG table/index files that aren't part of the catalog
>
> REPLICA:
>
> /srv/old_cluster
> ... lots of files
> /srv/new_cluster
> ... entirely empty
>
> Then you run the rsync, at the /srv level, and all those 'new files' in
> /srv/new_cluster on the primary should get copied over to
> /srv/new_cluster on the replica, while any files which are hard-linked
> between old_cluster and new_cluster should end up as hard links on the
> replica.
>
> Thanks,
>
> Stephen
>

--
Robert Emery
Infrastructure Director

01785 711633

<> Codeweavers

Phone: 0800 021 0888
Website: codeweavers.net

Barn 4, Dunston Business Village, ST18 9AB. Registered in England and
Wales No. 04092394 VAT registration no. 974 9705 63

--

Attachment Content-Type Size
master-pgcontroldata.txt text/plain 2.3 KB
run-sheet.txt text/plain 3.0 KB
secondary-pgcontroldata.txt text/plain 2.3 KB
master-rsync.txt text/plain 82.1 KB

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Stephen Frost 2020-12-03 13:10:53 Re: pg_upgrade + replica servers + rsync --size-only is unsafe
Previous Message Atul Kumar 2020-12-03 08:49:24 time taking deletion on large tables