Re: out-of-order XID insertion in KnownAssignedXids

From: <fredrik(at)huitfeldt(dot)com>
To: "Andres Freund" <andres(at)anarazel(dot)de>
Cc: "Michael Paquier" <michael(dot)paquier(at)gmail(dot)com>, "pgsql-general" <pgsql-general(at)postgresql(dot)org>, <kgrittn(at)gmail(dot)com>
Subject: Re: out-of-order XID insertion in KnownAssignedXids
Date: 2016-10-24 13:10:41
Message-ID: 1477314642631.66023.33936@webmail2
Lists: pgsql-general

Hi All,

thank you all, I sincerely appreciate your feedback.

I have done a fair amount of testing on the solution proposed by you all (not removing backup_label), and it seems to have completely addressed the issue.

The step that removes backup_label was actually introduced some time back, and I am not completely certain how it crept into our codebase. I think that at least part of the explanation lies in the fact that we are experiencing a fair amount of growth in database size and usage on some of our installations. This could be the reason why extensive testing did not show the issue back then and why we are seeing it now.

Would it make sense to log a warning in the case of a missing backup_label file, or would it be difficult to identify that situation in the code? I would be happy to dig in and develop a patch.
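
Independent of any server-side change, a restore script could perform this kind of check itself before starting the server. The following is only a minimal sketch under stated assumptions: the helper name `missing_backup_label_warning` is hypothetical, the script is assumed to know the path of the restored data directory, and the check is only meaningful for a directory that was restored from a base backup (a cluster that was never restored from one legitimately has no backup_label).

```python
import os
import tempfile


def missing_backup_label_warning(data_dir):
    """Return a warning string if data_dir has no backup_label, else None.

    Hypothetical helper for a restore script; it is not part of PostgreSQL.
    """
    label_path = os.path.join(data_dir, "backup_label")
    if os.path.isfile(label_path):
        return None
    return (
        "WARNING: no backup_label found in %s; if this directory was "
        "restored from a base backup, recovery may start from the wrong "
        "checkpoint" % data_dir
    )


if __name__ == "__main__":
    # Demonstrate both cases against a throwaway directory.
    with tempfile.TemporaryDirectory() as restored_dir:
        # No backup_label yet: a warning string is returned.
        print(missing_backup_label_warning(restored_dir))
        # Create an empty backup_label: the check now returns None.
        open(os.path.join(restored_dir, "backup_label"), "w").close()
        print(missing_backup_label_warning(restored_dir))
```

The sketch only tests for the file's presence; it does not validate the file's contents.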

With regard to the package version, we *are* working with a few "stock" scenarios, one of which is a fairly old RHEL installation. We also have CentOS versions that are much more up to date.

Best regards, and thank you all again,

Fredrik

On 20 October 2016 at 22:38:26 +02:00, Andres Freund <andres(at)anarazel(dot)de> wrote:

> On 2016-10-20 22:37:15 +0900, Michael Paquier wrote:
>
> > On Thu, Oct 20, 2016 at 10:21 PM, <fredrik(at)huitfeldt(dot)com> wrote:
> >
> > > - remove a file called backup_label, but I am not certain that this file is
> > > in fact there (any more).
> > >
> > It is never a good idea when you are trying to restore from a backup,
> > backup_label contains critical information when restoring from a
> > backup, so you may finish with a corrupted data folder.
> >
> And this actually seems like a likely source of these errors. Removing
> a backup label unfortunately causes hard to diagnose errors, because
> everything appears to be ok as long as there's no checkpoints while
> taking the base backups (or when the control file was copied early
> enough). But as soon as a second checkpoint happens before the control
> file is copied...
>
> Fredrik, how did you end up removing the label?
>
> Greetings,
>
> Andres Freund