Re: Re: "missing chunk number XX for toast value YY in pg_toast ..." after pg_basebackup.

From: <fredrik(at)huitfeldt(dot)com>
To: "Adrian Klaver" <adrian(dot)klaver(at)aklaver(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Re: "missing chunk number XX for toast value YY in pg_toast ..." after pg_basebackup.
Date: 2016-03-09 18:41:53
Message-ID: 1457548913211.38205.2418@webmail7
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Adrian,

thank you very much for your patience. I apologise for the missing information.

On 9 March 2016 16:13:00 +01:00, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com> wrote:

> On 03/09/2016 04:56 AM, <fredrik(at)huitfeldt(dot)com> wrote:
>
> > Hi Adrian,
> >
> > thank you very much for your response.
> >
> > I ran the "VACUUM ANALYZE" command on the master node.
> >
> > Regarding log messages.
> >
> > Here is the contents of the log (excluding connections/disconnections):
> >
> Assuming the below is from the replica database.
> the "LOG: recovery was paused" message was indeed from the replica.

>
>
> >
> > 2016-02-22 02:30:08 GMT 24616 LOG: recovery has paused
> >
> So what happened to cause the above?
>
we automatically pause recovery on the replica before running pg_dump. This is in order to make certain that we get a consistent dump of the database.

>
> I am not seeing anything below that indicates the recovery started again.
> the reason why we do not see a matching "resume" is that the pg_dump failed and our error handling was insufficient.

>
>
> > 2016-02-22 02:30:08 GMT 24616 HINT: Execute pg_xlog_replay_resume() to
> > continue.
> > 2016-02-22 02:37:19 GMT 23859 DBNAME ERROR: missing chunk number 0 for
> > toast value 2747579 in pg_toast_22066
> > 2016-02-22 02:37:19 GMT 23859 DBNAME STATEMENT: COPY public.room_shape
> > (room_uuid, data) TO stdout;
> > 2016-02-22 02:37:41 GMT 2648 DBNAME LOG: could not receive data from
> > client: Connection reset by peer
> > 2016-02-22 02:37:41 GMT 2648 DBNAME LOG: unexpected EOF on client
> > connection
> >
> What does the log from the master show?
>
It doesnt seem to show much. It does have these repeated messages, however:
2016-02-22 02:12:18 GMT 30908 LOG: using stale statistics instead of current ones because stats collector is not responding
2016-02-22 02:13:01 GMT 30908 LOG: using stale statistics instead of current ones because stats collector is not responding
2016-02-22 02:13:52 GMT 30908 LOG: using stale statistics instead of current ones because stats collector is not responding

There are lots of these mesages within the timeframe. There seems to be a couple of them every 2-4 hours.

> >
> >
> > Best regards,
> > Fredrik Huitfeldt
> >
> >
> > On 7 March 2016 16:35:29 +01:00, Adrian Klaver
> > <<adrian(dot)klaver(at)aklaver(dot)com>> wrote:
> >
> > > On 03/06/2016 10:18 PM, <fredrik(at)huitfeldt(dot)com>
> > > <mailto:<fredrik(at)huitfeldt(dot)com>> wrote:
> > >
> > > HI All,
> > >
> > > i would really appreciate any help I can get on this issue.
> > >
> > > basically, a pg_basebackup + streaming attach, led to a database
> > > that we
> > > could not read from afterwards.
> > >
> > >
> > > From original post:
> > >
> > > <http://www.postgresql.org/message-id/1456919678340.31300.116900@webmail2>
> > >
> > > "The issue remained until we ran a full vacuum analyze on the cluster."
> > >
> > > Which cluster was that, the master or the slave?
> > >
> > > "I have logfiles from the incident, but I cannot see anything out of
> > > the ordinary (despite having a fair amount of experience investigating
> > > postgresql logs)."
> > >
> > >
> > > Can we see the section before and after ERROR?
> > >
> > >
> > > Beset regards,
> > > Fredrik
> > >
> > > PS please advise if this is better posted on another list.
> > >
> > >
> > >
> > > --
> > > Adrian Klaver
> > > <adrian(dot)klaver(at)aklaver(dot)com> <mailto:<adrian(dot)klaver(at)aklaver(dot)com>>
> > >
> >
>
> --
> Adrian Klaver
> <adrian(dot)klaver(at)aklaver(dot)com>
> Best regards,

Fredrik

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Magnus Hagander 2016-03-09 19:17:35 Re: Email address VERP problems (was RE: Does a call to a language handler provide a context/session, and somewhere to keep session data?
Previous Message Alvaro Herrera 2016-03-09 18:03:33 Re: Email address VERP problems (was RE: Does a call to a language handler provide a context/session, and somewhere to keep session data?