Re: Postgresql 9.2 has standby server lost data?

From: Paula Price <pj8abug(at)gmail(dot)com>
To: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Postgresql 9.2 has standby server lost data?
Date: 2015-06-19 20:05:41
Message-ID: CAL3+i0d6dNmiNuqJaLYHcYT+gZUXmWmYC9M6tC6H3yVvhHmhMg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Fri, Jun 19, 2015 at 12:01 PM, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
wrote:

> On 06/18/2015 05:45 PM, Paula Price wrote:
>
>> I have Postgresql 9.2.10 streaming replication set up with log shipping in
>> case the replication falls behind. I discovered that the log-shipping had
>> been disabled at some point in time. I enabled the log shipping again.
>>
>> If at some point in time the streaming replication fell behind and the
>> standby server was not able to retrieve the necessary WAL file(s) from the
>> primary, would the standby server continue to function normally? Do I
>> need
>> to rebuild the standby server? I have restarted the standby server and it
>> is up and running with no issues.
>>
>
> Well that seems at odds with it being unable to retrieve the WAL files.
> This leads to these questions:
>
> 1) What makes you think it did not retrieve the WAL files via streaming?


It *may* *not *have fallen behind via replication. We do have standby
servers that fall behind, but since we have log-shipping it is not a
concern. On this server, i have no idea how long we were running without
log-shipping. I have no idea how many log files I would have to go through
to find out when log-shipping stopped.
My basic question is:
If a standby server falls behind with streaming replication AND the standby
server cannot obtain the WAL file needed from the primary, will you get an
error from the standby server? Or does it just hiccup and try to carry on?​

>
> 2) What does the postgres log show at the time you restarted the standby?

​2015-06-18 01:12:41.871 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG:
> 00000: database system was shut down in recovery at 2015-06-18 01:12:14 UTC
>
2015-06-18 01:12:41.871 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION:
> StartupXLOG, xlog.c:6298
>
2015-06-18 01:12:41.904 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG:
> 00000: entering standby mode
>
2015-06-18 01:12:41.904 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION:
> StartupXLOG, xlog.c:6384
>
2015-06-18 01:12:41.987 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG:
> 00000: redo starts at 867/FDF32E18
>
2015-06-18 01:12:41.987 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION:
> StartupXLOG, xlog.c:6855
>
2015-06-18 01:12:42.486 UTC::[unknown](at)[unknown]:[28213]:2015-06-18
> 01:12:42 UTC: LOG: 00000: connection received: host=[local]
>
2015-06-18 01:12:42.486 UTC::[unknown](at)[unknown]:[28213]:2015-06-18
> 01:12:42 UTC: LOCATION: BackendInitialize, postmaster.c:3501
>
2015-06-18 01:12:42.486 UTC:[local]:postgres(at)postgres:[28213]:2015-06-18
> 01:12:42 UTC: FATAL: 57P03: the database system is starting up
>
2015-06-18 01:12:42.486 UTC:[local]:postgres(at)postgres:[28213]:2015-06-18
> 01:12:42 UTC: LOCATION: ProcessStartupPacket, postmaster.c:1792
>
2015-06-18 01:12:43.488 UTC::[unknown](at)[unknown]:[28270]:2015-06-18
> 01:12:43 UTC: LOG: 00000: connection received: host=[local]
>
2015-06-18 01:12:43.488 UTC::[unknown](at)[unknown]:[28270]:2015-06-18
> 01:12:43 UTC: LOCATION: BackendInitialize, postmaster.c:3501
>
2015-06-18 01:12:43.488 UTC:[local]:postgres(at)postgres:[28270]:2015-06-18
> 01:12:43 UTC: FATAL: 57P03: the database system is starting up
>
2015-06-18 01:12:43.488 UTC:[local]:postgres(at)postgres:[28270]:2015-06-18
> 01:12:43 UTC: LOCATION: ProcessStartupPacket, postmaster.c:1792
>
2015-06-18 01:12:44.489 UTC::[unknown](at)[unknown]:[28327]:2015-06-18
> 01:12:44 UTC: LOG: 00000: connection received: host=[local]
>
2015-06-18 01:12:44.489 UTC::[unknown](at)[unknown]:[28327]:2015-06-18
> 01:12:44 UTC: LOCATION: BackendInitialize, postmaster.c:3501
>
2015-06-18 01:12:44.489 UTC:[local]:postgres(at)postgres:[28327]:2015-06-18
> 01:12:44 UTC: FATAL: 57P03: the database system is starting up
>
2015-06-18 01:12:44.489 UTC:[local]:postgres(at)postgres:[28327]:2015-06-18
> 01:12:44 UTC: LOCATION: ProcessStartupPacket, postmaster.c:1792
>
2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG:
> 00000: consistent recovery state reached at 868/112AF7F8
>
2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION:
> CheckRecoveryConsistency, xlog.c:7405
>
2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOG:
> 00000: invalid record length at 868/112AFB00
>
2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC: LOCATION:
> ReadRecord, xlog.c:4078
>
2015-06-18 01:12:44.490 UTC::@:[28166]:2015-06-18 01:12:41 UTC: LOG:
> 00000: database system is ready to accept read only connections
>
2015-06-18 01:12:44.490 UTC::@:[28166]:2015-06-18 01:12:41 UTC: LOCATION:
> sigusr1_handler, postmaster.c:4314
>

>
> I need to know if the
>
>> data integrity has been compromised.
>>
>> I have run this query to determine the lag time for the standby(in case
>> this tells me anything):
>> "SELECT now(), now() - pg_last_xact_replay_timestamp() AS time_lag;
>> RESULT:
>> "2015-06-19 00:40:48.83701+00";"00:00:01.078616"
>>
>>
>> Thank you,
>> Paula P
>>
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Julien Rouhaud 2015-06-19 20:17:16 Re: Is there any way to measure disk activity for each query?
Previous Message Tom Lane 2015-06-19 19:32:54 Re: Postgres SIGALRM timer