From: | Stephen Frost <sfrost(at)snowman(dot)net> |
---|---|
To: | Jim Nasby <Jim(dot)Nasby(at)BlueTreble(dot)com> |
Cc: | Bruce Momjian <bruce(at)momjian(dot)us>, Andres Freund <andres(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org>, Josh Berkus <josh(at)agliodbs(dot)com> |
Subject: | Re: pg_upgrade and rsync |
Date: | 2015-01-23 18:40:55 |
Message-ID: | 20150123184055.GE3854@tamriel.snowman.net |
Lists: | pgsql-hackers |
* Jim Nasby (Jim(dot)Nasby(at)BlueTreble(dot)com) wrote:
> On 1/22/15 7:54 PM, Stephen Frost wrote:
> >* Bruce Momjian (bruce(at)momjian(dot)us) wrote:
> >>>On Fri, Jan 23, 2015 at 01:19:33AM +0100, Andres Freund wrote:
> >>>> >Or do you - as the text edited in your patch, but not the quote above -
> >>>> >mean to run pg_upgrade just on the primary and then rsync?
> >>>
> >>>No, I was going to run it on both, then rsync.
> >I'm pretty sure this is all a lot easier than you believe it to be. If
> >you want to recreate what pg_upgrade does to a cluster then the simplest
> >thing to do is rsync before removing any of the hard links. rsync will
> >simply recreate the same hard link tree that pg_upgrade created when it
> >ran, and update files which were actually changed (the catalog tables).
> >
> >The problem, as mentioned elsewhere, is that you have to checksum all
> >the files because the timestamps will differ. You can actually get
> >around that with rsync if you really want though- tell it to only look
> >at file sizes instead of size+time by passing in --size-only.
>
> What if instead of trying to handle that on the rsync side, we changed pg_upgrade so that it created hardlinks that had the same timestamp as the original file?
So, two things. First, I chatted with Bruce and he was less concerned
about not being able to match up the timestamps than I was. He has a
point, though: the catalog tables are going to get copied anyway, since
they won't be hard links, and checking that all the other files match in
size and that both the master and the standby are at the same xlog
position should give you a pretty good feeling that everything matches
up sufficiently.
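To make the "files match in size" part concrete, here is a rough sketch in Python of a size-only comparison between two directory trees. The function name and the two root arguments are made up for illustration. It is not what rsync does internally, and it does not cover the xlog position check, which would have to be done separately (for example by comparing pg_controldata output on both sides).

```python
import os


def size_mismatches(master_root, standby_root):
    """Return relative paths whose size differs, or that exist on one side only.

    Both arguments are hypothetical local paths to two copies of a data
    directory; timestamps and file contents are deliberately ignored.
    """
    def sizes(root):
        # Map each file's path (relative to root) to its size in bytes.
        result = {}
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                result[os.path.relpath(full, root)] = os.path.getsize(full)
        return result

    master = sizes(master_root)
    standby = sizes(standby_root)
    # A path missing on one side yields None there, so it is reported too.
    return sorted(
        path
        for path in master.keys() | standby.keys()
        if master.get(path) != standby.get(path)
    )
```

An empty list means every file is present on both sides with the same size, which is the same signal --size-only relies on. It says nothing about whether the contents are identical.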
Second, I don't follow what you mean by having pg_upgrade change the
hardlinks to have the same timestamp. For starters, the timestamp is in
the inode and not the actual hard link (two files hard linked together
can't have different timestamps). Beyond that, the problem isn't on the
master side; it's on the standby side. The standby's files will have
timestamps different from the master and there really isn't much to be
done about that.
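The inode point is easy to check for yourself. A minimal sketch in Python, with made-up file names, that creates a file, hard links it, and then sets the mtime through only one of the two names:

```python
import os
import tempfile


def hardlink_timestamps_match():
    """Hard link a file, change the mtime via one name, compare both names.

    Returns (same_inode, same_mtime, link_count). File names are
    hypothetical stand-ins for an old-cluster and new-cluster relation file.
    """
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "relfile_old")
        linked = os.path.join(tmp, "relfile_new")

        # One 8kB block of zeros stands in for a relation file.
        with open(original, "wb") as f:
            f.write(b"\0" * 8192)

        # Second directory entry pointing at the same inode.
        os.link(original, linked)

        # Set atime/mtime through the new name only.
        os.utime(linked, (1_000_000_000, 1_000_000_000))

        a = os.stat(original)
        b = os.stat(linked)
        return a.st_ino == b.st_ino, a.st_mtime == b.st_mtime, a.st_nlink
```

Both names report the same inode and the same mtime, because the timestamp lives in the inode and the names are just directory entries pointing at it.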
> That said, the whole timestamp race condition in rsync gives me the heebie-jeebies. For normal workloads maybe it's not that big a deal, but when dealing with fixed-size data (ie: Postgres blocks)? Eww.
The race condition is a problem for pg_start/stop_backup and friends.
In this instance, everything will be shut down when the rsync is
running, so there isn't a timestamp race condition to worry about.
> How horribly difficult would it be to allow pg_upgrade to operate on multiple servers? Could we have it create a shell script instead of directly modifying things itself? Or perhaps some custom "command file" that could then be replayed by pg_upgrade on another server? Of course, that's assuming that replicas are compatible enough with masters for that to work...
Yeah, I had suggested that to Bruce also, but it's not clear why that
would be any different from an rsync --size-only in the end, presuming
everything went according to plan.
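For reference, a sketch of what such an rsync invocation might look like, written as a small Python helper that only builds the argument list and does not run anything. The helper name, the directory paths, and the standby host in the usage note are assumptions for illustration and not a procedure anyone in this thread has signed off on. The flags themselves (--archive, --hard-links, --size-only, --delete) are standard rsync options.

```python
import shlex


def build_rsync_command(old_cluster_dir, new_cluster_dir, remote_parent):
    """Build (but do not run) an rsync command for syncing a standby.

    Both cluster directories are passed in a single rsync run because
    --hard-links can only reproduce links between files that are part of
    the same transfer. All three arguments are hypothetical.
    """
    return [
        "rsync",
        "--archive",     # recurse, keep permissions, ownership, times
        "--hard-links",  # recreate the hard link tree on the standby
        "--size-only",   # skip files whose size matches, ignoring mtime
        "--delete",      # remove standby files that no longer exist here
        old_cluster_dir,
        new_cluster_dir,
        remote_parent,
    ]


def format_command(args):
    """Render the argument list as a shell-quoted string for review."""
    return " ".join(shlex.quote(arg) for arg in args)
```

Calling format_command(build_rsync_command("/srv/pg/9.3", "/srv/pg/9.4", "standby.example.com:/srv/pg")) gives a single command line to read over before running it by hand, with both servers shut down.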
Thanks,
Stephen