From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Patrick Krecker <patrick(at)judicata(dot)com> |
Cc: | Chris Hundt <chris(at)judicata(dot)com>, Craig Ringer <craig(at)2ndquadrant(dot)com>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: WAL receive process dies |
Date: | 2014-08-29 22:46:38 |
Message-ID: | 20140829224638.GK10109@awork2.anarazel.de |
Lists: | pgsql-general |
[FWIW: proper quoting makes answering easier and thus more likely]
On 2014-08-29 15:37:51 -0700, Patrick Krecker wrote:
> I ran the following on the local endpoint of spiped:
>
> while [ true ]; do psql -h localhost -p 5445 judicata -U marbury -c "select modtime, pg_last_xlog_receive_location(), pg_last_xlog_replay_location() from replication_time"; done;
>
> And the same command on production and I was able to verify that the xlogs
> for a given point in time were the same (modtime is updated every second by
> an upstart job):
>
> spiped from office -> production:
> modtime | pg_last_xlog_receive_location | pg_last_xlog_replay_location
> ----------------------------+-------------------------------+------------------------------
> 2014-08-29 15:23:25.563766 | 177/2E80C9F8 | 177/2E80C9F8
>
> Ran directly on production replica:
> modtime | pg_last_xlog_receive_location | pg_last_xlog_replay_location
> ----------------------------+-------------------------------+------------------------------
> 2014-08-29 15:23:25.563766 | 177/2E80C9F8 | 177/2E80C9F8
>
> To me, this is sufficient proof that spiped is indeed talking to the
> machine I think it's talking to (also lsof reports the correct hostname).
>
> I created another basebackup from the currently stuck postgres instance on
> another machine and I also get this error:
>
> 2014-08-29 15:27:30 PDT FATAL: could not receive data from WAL stream:
> ERROR: requested starting point 177/2D000000 is ahead of the WAL flush
> position of this server 174/B76D16A8
Uh. This indicates that the machine you're talking to is *not* one of
the above, as it reports a flush position of '174/B76D16A8'. That's not
really possible when the node actually is at '177/2E80C9F8'.
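To make the comparison concrete: a WAL location of the form 'XXX/YYYYYYYY' is two hexadecimal halves of one 64-bit position, so the values from the log can be compared numerically. The following is only a minimal sketch in Python, not something from the thread; `parse_lsn` is a hypothetical helper name chosen for illustration.

```python
def parse_lsn(text: str) -> int:
    """Convert a WAL location such as '177/2E80C9F8' into a 64-bit integer.

    The part before the slash is the high 32 bits and the part after it
    is the low 32 bits, both written in hexadecimal.
    """
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


# Values copied from the error message and the query output quoted above.
requested_start = parse_lsn("177/2D000000")  # where the new standby wants to start
upstream_flush = parse_lsn("174/B76D16A8")   # flush position reported by the upstream
replica_seen = parse_lsn("177/2E80C9F8")     # position seen through spiped and on production

# The server that answered the replication connection is behind the requested start.
print(requested_start > upstream_flush)

# The replica that was queried directly is already past the requested start.
print(replica_seen >= requested_start)
```

Both comparisons come out true, which is why the server answering the replication connection cannot be the node that reported '177/2E80C9F8'.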
Could you run, on the standby that's having problems, the following
command:
psql 'host=127.0.0.1 port=5445 user=XXX password=XXX replication=1' -c 'IDENTIFY_SYSTEM;'
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services