From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Peter Eisentraut <peter_e(at)gmx(dot)net> |
Cc: | Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Subject: | Re: pg_rewind exiting with error code 1 when source and target are on the same timeline |
Date: | 2015-12-14 23:11:15 |
Message-ID: | 937.1450134675@sss.pgh.pa.us |
Lists: | pgsql-bugs |
Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
> On 12/3/15 11:10 PM, Michael Paquier wrote:
>> On Fri, Dec 4, 2015 at 12:22 PM, Peter Eisentraut <peter_e(at)gmx(dot)net> wrote:
>>> After playing with this a bit, I think your patch is correct. The code
>>> has drifted a bit in the meantime, so attached is an updated patch.
>> Thanks for looking at it.
> I committed this to master. It's also on the 9.5 open item list, but if
> I backport it then the tests don't pass. Still looking. Not sure yet
> if this is because of code changes in pg_rewind master or test
> infrastructure changes in master.
I poked into this and found that the problem is that 9.5 is lacking the
hunks of commit e50cda78 that teach sanityChecks() to allow the control
file state to be DB_SHUTDOWNED_IN_RECOVERY, to wit
@@ -374,10 +380,11 @@ sanityChecks(void)
     /*
      * Target cluster better not be running. This doesn't guard against
      * someone starting the cluster concurrently. Also, this is probably more
-     * strict than necessary; it's OK if the master was not shut down cleanly,
-     * as long as it isn't running at the moment.
+     * strict than necessary; it's OK if the target node was not shut down
+     * cleanly, as long as it isn't running at the moment.
      */
-    if (ControlFile_target.state != DB_SHUTDOWNED)
+    if (ControlFile_target.state != DB_SHUTDOWNED &&
+        ControlFile_target.state != DB_SHUTDOWNED_IN_RECOVERY)
         pg_fatal("target server must be shut down cleanly\n");
     /*
@@ -385,75 +392,149 @@ sanityChecks(void)
      * server is shut down. There isn't any very strong reason for this
      * limitation, but better safe than sorry.
      */
-    if (datadir_source && ControlFile_source.state != DB_SHUTDOWNED)
+    if (datadir_source &&
+        ControlFile_source.state != DB_SHUTDOWNED &&
+        ControlFile_source.state != DB_SHUTDOWNED_IN_RECOVERY)
         pg_fatal("source data directory must be shut down cleanly\n");
 }
(Actually, only the second of these is critical to making the test
pass, but I think we should apply both of them if we apply either.)
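For illustration only, the relaxed condition in both hunks can be sketched as a standalone predicate. This is a rough sketch, not code from pg_rewind: the enum is a local stand-in for DBState from catalog/pg_control.h, redeclared so the fragment compiles by itself, and the helper name is_cleanly_shut_down() is hypothetical.

```c
#include <stdbool.h>

/*
 * Local stand-in for the DBState enum declared in PostgreSQL's
 * catalog/pg_control.h; redeclared here only so the sketch is
 * self-contained.
 */
typedef enum DBState
{
	DB_STARTUP = 0,
	DB_SHUTDOWNED,
	DB_SHUTDOWNED_IN_RECOVERY,
	DB_SHUTDOWNING,
	DB_IN_CRASH_RECOVERY,
	DB_IN_ARCHIVE_RECOVERY,
	DB_IN_PRODUCTION
} DBState;

/*
 * Hypothetical helper (not part of pg_rewind): returns true when the
 * control file state is one the relaxed sanity checks would accept,
 * i.e. a clean shutdown either in normal operation or in recovery.
 */
static bool
is_cleanly_shut_down(DBState state)
{
	return state == DB_SHUTDOWNED ||
		state == DB_SHUTDOWNED_IN_RECOVERY;
}
```

With such a helper, each check would reduce to a single negated call, for example rejecting the target when is_cleanly_shut_down(ControlFile_target.state) is false. A standby that was stopped cleanly reports DB_SHUTDOWNED_IN_RECOVERY, which is the case the unpatched 9.5 checks reject.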
If I apply these, without any of the rest of e50cda78, everything seems
fine. I'm going to go ahead and push that in the interests of getting
some buildfarm cycles on it; but if someone could confirm that this
is not an insane thing to do, it'd help.
regards, tom lane