Re: Postgresql 9.2 has standby server lost data?

From: Paula Price <pj8abug(at)gmail(dot)com>
To: Jerry Sievers <gsievers19(at)comcast(dot)net>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Postgresql 9.2 has standby server lost data?
Date: 2015-06-23 23:49:55
Message-ID: CAL3+i0dFuaSm28Lx_W2Z+5JDT3wAfg66T=MO8b1yRPNtaDia=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

​The script I used to check the lag time between the primary and the
standby would show that the standby server was not even close, right?

Paula​

On Sat, Jun 20, 2015 at 9:51 AM, Jerry Sievers <gsievers19(at)comcast(dot)net>
wrote:

> Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> writes:
>
> > On 06/19/2015 01:05 PM, Paula Price wrote:
> >
> >>
> >>
> >> On Fri, Jun 19, 2015 at 12:01 PM, Adrian Klaver
> >> <adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
> >>
> >> On 06/18/2015 05:45 PM, Paula Price wrote:
> >>
> >> I have Postgresql 9.2.10 streaming replication set up with log
> >> shipping in
> >> case the replication falls behind. I discovered that the
> >> log-shipping had
> >> been disabled at some point in time. I enabled the log shipping
> >> again.
> >>
> >> If at some point in time the streaming replication fell behind
> >> and the
> >> standby server was not able to retrieve the necessary WAL
> >> file(s) from the
> >> primary, would the standby server continue to function
> >> normally? Do I need
> >> to rebuild the standby server? I have restarted the standby
> >> server and it
> >> is up and running with no issues.
> >>
> >>
> >> Well that seems at odds with it being unable to retrieve the WAL
> >> files. This leads to these questions:
> >>
> >> 1) What makes you think it did not retrieve the WAL files via
> streaming?
> >>
> >> ​
> >> It _may_ _not _have fallen behind via replication. We do have standby
> >> servers that fall behind, but since we have log-shipping it is not a
> >> concern. On this server, i have no idea how long we were running
> >> without log-shipping. I have no idea how many log files I would have to
> >> go through to find out when log-shipping stopped.
> >> My basic question is:
> >> If a standby server falls behind with streaming replication AND the
> >> standby server cannot obtain the WAL file needed from the primary, will
> >> you get an error from the standby server? Or does it just hiccup and
> >> try to carry on?​
> >
> > No it will fall over:
>
> I wouldn't describe it that way...
>
> To a user, the standby will function and appear normal, unless they
> notice that the data is not current.
>
> In the server logs, there will be indications that replication is stuck
> waiting for WAL.
>
> HTH
>
> >
> http://www.postgresql.org/docs/9.2/interactive/runtime-config-replication.html#RUNTIME-CONFIG-REPLICATION-SENDER
> >
> > wal_keep_segments (integer)
> >
> > Specifies the minimum number of past log file segments kept in the
> > pg_xlog directory, in case a standby server needs to fetch them for
> > streaming replication. Each segment is normally 16 megabytes. If a
> > standby server connected to the sending server falls behind by more
> > than wal_keep_segments segments, the sending server might remove a WAL
> > segment still needed by the standby, in which case the replication
> > connection will be terminated. Downstream connections will also
> > eventually fail as a result. (However, the standby server can recover
> > by fetching the segment from archive, if WAL archiving is in use.)
> >
> > This sets only the minimum number of segments retained in pg_xlog;
> > the system might need to retain more segments for WAL archival or to
> > recover from a checkpoint. If wal_keep_segments is zero (the default),
> > the system doesn't keep any extra segments for standby purposes, so
> > the number of old WAL segments available to standby servers is a
> > function of the location of the previous checkpoint and status of WAL
> > archiving. This parameter can only be set in the postgresql.conf file
> > or on the server command line.
> >
> > When you started up if the necessary WAL files where not on the server
> > you would have seen Postgres throwing errors in the log.
> >
> > I would check out the below to verify:
> >
> >
> http://www.postgresql.org/docs/9.2/interactive/warm-standby.html#STREAMING-REPLICATION
> >
> > 25.2.5.2. Monitoring
> >
> >>
> >>
> >> 2) What does the postgres log show at the time you restarted the
> >> standby?
> >>
> >> ​2015-06-18 01:12:41.871 UTC::@:[28168]:2015-06-18 01:12:41
> UTC:
> >> LOG: 00000: database system was shut down in recovery at
> >> 2015-06-18 01:12:14 UTC
> >>
> >> 2015-06-18 01:12:41.871 UTC::@:[28168]:2015-06-18 01:12:41 UTC:
> >> LOCATION: StartupXLOG, xlog.c:6298
> >>
> >> 2015-06-18 01:12:41.904 UTC::@:[28168]:2015-06-18 01:12:41 UTC:
> >> LOG: 00000: entering standby mode
> >>
> >> 2015-06-18 01:12:41.904 UTC::@:[28168]:2015-06-18 01:12:41 UTC:
> >> LOCATION: StartupXLOG, xlog.c:6384
> >>
> >> 2015-06-18 01:12:41.987 UTC::@:[28168]:2015-06-18 01:12:41 UTC:
> >> LOG: 00000: redo starts at 867/FDF32E18
> >>
> >> 2015-06-18 01:12:41.987 UTC::@:[28168]:2015-06-18 01:12:41 UTC:
> >> LOCATION: StartupXLOG, xlog.c:6855
> >>
> >> 2015-06-18 01:12:42.486
> >> UTC::[unknown](at)[unknown]:[28213]:2015-06-18 01:12:42 UTC: LOG:
> >> 00000: connection received: host=[local]
> >>
> >> 2015-06-18 01:12:42.486
> >> UTC::[unknown](at)[unknown]:[28213]:2015-06-18 01:12:42 UTC:
> >> LOCATION: BackendInitialize, postmaster.c:3501
> >>
> >> 2015-06-18 01:12:42.486
> >> UTC:[local]:postgres(at)postgres:[28213]:2015-06-18 01:12:42 UTC:
> >> FATAL: 57P03: the database system is starting up
> >>
> >> 2015-06-18 01:12:42.486
> >> UTC:[local]:postgres(at)postgres:[28213]:2015-06-18 01:12:42 UTC:
> >> LOCATION: ProcessStartupPacket, postmaster.c:1792
> >>
> >> 2015-06-18 01:12:43.488
> >> UTC::[unknown](at)[unknown]:[28270]:2015-06-18 01:12:43 UTC: LOG:
> >> 00000: connection received: host=[local]
> >>
> >> 2015-06-18 01:12:43.488
> >> UTC::[unknown](at)[unknown]:[28270]:2015-06-18 01:12:43 UTC:
> >> LOCATION: BackendInitialize, postmaster.c:3501
> >>
> >> 2015-06-18 01:12:43.488
> >> UTC:[local]:postgres(at)postgres:[28270]:2015-06-18 01:12:43 UTC:
> >> FATAL: 57P03: the database system is starting up
> >>
> >> 2015-06-18 01:12:43.488
> >> UTC:[local]:postgres(at)postgres:[28270]:2015-06-18 01:12:43 UTC:
> >> LOCATION: ProcessStartupPacket, postmaster.c:1792
> >>
> >> 2015-06-18 01:12:44.489
> >> UTC::[unknown](at)[unknown]:[28327]:2015-06-18 01:12:44 UTC: LOG:
> >> 00000: connection received: host=[local]
> >>
> >> 2015-06-18 01:12:44.489
> >> UTC::[unknown](at)[unknown]:[28327]:2015-06-18 01:12:44 UTC:
> >> LOCATION: BackendInitialize, postmaster.c:3501
> >>
> >> 2015-06-18 01:12:44.489
> >> UTC:[local]:postgres(at)postgres:[28327]:2015-06-18 01:12:44 UTC:
> >> FATAL: 57P03: the database system is starting up
> >>
> >> 2015-06-18 01:12:44.489
> >> UTC:[local]:postgres(at)postgres:[28327]:2015-06-18 01:12:44 UTC:
> >> LOCATION: ProcessStartupPacket, postmaster.c:1792
> >>
> >> 2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC:
> >> LOG: 00000: consistent recovery state reached at 868/112AF7F8
> >>
> >> 2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC:
> >> LOCATION: CheckRecoveryConsistency, xlog.c:7405
> >>
> >> 2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC:
> >> LOG: 00000: invalid record length at 868/112AFB00
> >>
> >> 2015-06-18 01:12:44.490 UTC::@:[28168]:2015-06-18 01:12:41 UTC:
> >> LOCATION: ReadRecord, xlog.c:4078
> >>
> >> 2015-06-18 01:12:44.490 UTC::@:[28166]:2015-06-18 01:12:41 UTC:
> >> LOG: 00000: database system is ready to accept read only
> >> connections
> >>
> >> 2015-06-18 01:12:44.490 UTC::@:[28166]:2015-06-18 01:12:41 UTC:
> >> LOCATION: sigusr1_handler, postmaster.c:4314
> >>
> >> ​
> >>
> >> I need to know if the
> >>
> >> data integrity has been compromised.
> >>
> >> I have run this query to determine the lag time for the
> >> standby(in case
> >> this tells me anything):
> >> "SELECT now(), now() - pg_last_xact_replay_timestamp() AS
> time_lag;
> >> RESULT:
> >> "2015-06-19 00:40:48.83701+00";"00:00:01.078616"
> >>
> >>
> >> Thank you,
> >> Paula P
> >>
> >>
> >>
> >> --
> >> Adrian Klaver
> >> adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
> >>
> >>
> >
> >
> > --
> > Adrian Klaver
> > adrian(dot)klaver(at)aklaver(dot)com
>
> --
> Jerry Sievers
> Postgres DBA/Development Consulting
> e: postgres(dot)consulting(at)comcast(dot)net
> p: 312.241.7800
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bret Stern 2015-06-24 01:58:14 Re: native api or odbc?
Previous Message Ted Toth 2015-06-23 22:56:41 native api or odbc?