From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | Satoshi Nagayasu <snaga(at)uptime(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Re: pg_rewind in contrib |
Date: | 2014-12-16 09:37:33 |
Message-ID: | 548FFD5D.80703@vmware.com |
Lists: | pgsql-hackers |
On 12/16/2014 11:23 AM, Satoshi Nagayasu wrote:
> Hi,
>
> On 2014/12/12 23:13, Heikki Linnakangas wrote:
> > Hi,
> >
> > I'd like to include pg_rewind in contrib. I originally wrote it as an
> > external project so that I could quickly get it working with the
> > existing versions, and because I didn't feel it was quite ready for
> > production use yet. Now, with the WAL format changes in master, it is a
> > lot more maintainable than before. Many bugs have been fixed since the
> > first prototypes, and I think it's fairly robust now.
> >
> > I propose that we include pg_rewind in contrib/ now. Attached is a patch
> > for that. It just includes the latest sources from the current pg_rewind
> > repository at https://github.com/vmware/pg_rewind. It is released under
> > the PostgreSQL license.
> >
> > For those who are not familiar with pg_rewind, it's a tool that allows
> > repurposing an old master server as a new standby server, after
> > promotion, even if the old master was not shut down cleanly. That's a
> > very often requested feature.
>
> I'm testing pg_rewind with a very basic first scenario.
> My test script is here:
>
> https://github.com/snaga/pg_rewind_test/blob/master/pg_rewind_test.sh
>
> At the very least, I think the file descriptor "srcfd" should be closed
> before exiting copy_file_range(). I got a "can't open file" error with
> "too many open files" while running pg_rewind.
>
> ------------------------------------------------
> diff --git a/contrib/pg_rewind/copy_fetch.c b/contrib/pg_rewind/copy_fetch.c
> index bea1b09..5a8cc8e 100644
> --- a/contrib/pg_rewind/copy_fetch.c
> +++ b/contrib/pg_rewind/copy_fetch.c
> @@ -280,6 +280,8 @@ copy_file_range(const char *path, off_t begin, off_t end, bool trunc)
>          write_file_range(buf, begin, readlen);
>          begin += readlen;
>      }
> +
> +    close(srcfd);
>  }
>
>  /*
> ------------------------------------------------

Yep, good catch. I pushed a fix to the pg_rewind repository at GitHub.

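For illustration only, the leak pattern and its fix can be sketched as below. This is not the actual pg_rewind code: copy_range_sketch() and its two-path signature are hypothetical, and the real copy_file_range() writes through write_file_range() instead. The point is just that every descriptor opened in the function is closed on every exit path, so calling it once per file cannot run the process into "too many open files".

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not pg_rewind's actual code: copy bytes
 * [begin, end) of srcpath to the same offsets in dstpath.
 * Returns 0 on success, -1 on error. A short write is treated
 * as an error to keep the sketch small.
 */
static int
copy_range_sketch(const char *srcpath, const char *dstpath,
                  off_t begin, off_t end)
{
    char    buf[8192];
    int     srcfd;
    int     dstfd;
    int     rc = -1;

    assert(begin >= 0 && begin <= end);

    srcfd = open(srcpath, O_RDONLY);
    if (srcfd < 0)
        return -1;

    dstfd = open(dstpath, O_WRONLY | O_CREAT, 0600);
    if (dstfd < 0)
    {
        close(srcfd);           /* do not leak srcfd on this path either */
        return -1;
    }

    if (lseek(srcfd, begin, SEEK_SET) == (off_t) -1 ||
        lseek(dstfd, begin, SEEK_SET) == (off_t) -1)
        goto done;

    while (begin < end)
    {
        size_t  want = sizeof(buf);
        ssize_t readlen;

        if ((off_t) want > end - begin)
            want = (size_t) (end - begin);

        readlen = read(srcfd, buf, want);
        if (readlen <= 0)
            goto done;          /* read error or unexpected EOF */
        if (write(dstfd, buf, (size_t) readlen) != readlen)
            goto done;
        begin += readlen;
    }
    rc = 0;

done:
    /* The essence of the fix: release the descriptors before returning. */
    close(srcfd);
    close(dstfd);
    return rc;
}
```

Without the close() calls at the end, each call would leave descriptors open, and a run that copies more files than the per-process limit allows would start failing in open().
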
> And I have one question here.
>
> pg_rewind assumes that the source PostgreSQL has, at least, one
> checkpoint after getting promoted. I think the target timeline id
> in the pg_control file to be read is only available after the first
> checkpoint. Right?

Yes, it does assume that the source server (= old standby, new master)
has had at least one checkpoint after promotion. It probably should be
more explicit about it: if there hasn't been a checkpoint, you will
currently get an error "source and target cluster are both on the same
timeline", which isn't very informative.

I assume that by "target timeline ID" you meant the timeline ID of the
source server, i.e. the timeline that the target server should be
rewound to.

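As a rough sketch of what a more explicit message could look like, assuming the two timeline IDs have already been read from each cluster's pg_control: timeline_hint() and its wording are hypothetical and not part of pg_rewind, and the TimeLineID typedef is redeclared here only so the example stands alone.

```c
#include <stddef.h>
#include <stdint.h>

/* Redeclared for the sketch; assumed to be a 32-bit unsigned integer. */
typedef uint32_t TimeLineID;

/*
 * Hypothetical check, not pg_rewind's actual code. Returns NULL when
 * the timelines differ (rewinding can proceed), otherwise a message
 * that also mentions the missing-checkpoint case.
 */
static const char *
timeline_hint(TimeLineID source_tli, TimeLineID target_tli)
{
    if (source_tli != target_tli)
        return NULL;

    return "source and target cluster are both on the same timeline; "
           "if the source was promoted recently, it may not have "
           "completed a checkpoint since promotion yet";
}
```

A caller would print the returned string and exit when it is not NULL.
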
- Heikki