Re: wal segment failed

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Lucas Possamai <drum(dot)lucas(at)gmail(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: wal segment failed
Date: 2016-05-18 00:54:45
Message-ID: CAKFQuwY9m+57KVWk7hioAvr4QTGvT2fOrWuh8Nm49xhBakuy2g@mail.gmail.com
Lists: pgsql-admin

On Tue, May 17, 2016 at 8:42 PM, David G. Johnston <
david(dot)g(dot)johnston(at)gmail(dot)com> wrote:

> On Tue, May 17, 2016 at 8:32 PM, Lucas Possamai <drum(dot)lucas(at)gmail(dot)com>
> wrote:
>
>> yep so..
>>
>> 1 master and 2 slaves
>> all of those servers are working..
>>
>> The only error I got is this one:
>>
>>> Failed to archive *WAL* segment `pg_xlog/00000002000011E800000012` on
>>> host `localhost:30022
>>
>>
>> I'm having spikes that cause an outage every 15 minutes.. I believe the
>> cause of those spikes is the error above.
>>
>> The server was rebooted and a parameter in postgresql.conf was changed:
>> shared_buffers.
>>
>> So I don't believe the cause of this is that change.
>> Before the reboot on the server, everything was working.
>>
>> I just can't find the solution.
>>
>> What I did:
>>
>> 1 - I can connect as the postgres user between all the servers
>> 2 - the file 00000002000011E800000012 is in the master's pg_xlog (it
>> was already there)
>> 3 - the file 00000002000011E800000012 is in the slave servers'
>> /9.2/data/wal_archive (it was already there)
>>
>>
>>
> So the question that comes to my mind - taking the above at face value -
> is whether the archive_command is failing because it wants to archive said
> wal segment but, when it goes to do so, finds that the segment already
> exists in the target location. It correctly fails rather than risk
> corrupting the remote file, and because of that error the master will
> likewise not remove its copy of the segment.
>
> If you are certain, or can become certain, that the remote files are
> identical to the one on the master, it would seem that manually removing
> the wal segment on the master would resolve the deadlock. *I am not
> recommending that you do this.* But it is an option to consider. There
> are still too many unknowns, and I am too inexperienced here, for me to
> recommend anything definitive.
>
>
Actually, strike that... the system knows which segment it is trying to
archive, so simply removing it likely won't work out well; i.e., it probably
won't just move on to the next file in the directory. I'm not positive in
either case.
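
For what it's worth, the failure mode I guessed at above can be sketched in a
few lines of Python. This is only an illustration, not your actual
archive_command (which hasn't been shown in this thread): the function name
archive_segment and any paths are hypothetical. It refuses to overwrite an
existing archived segment whose contents differ, and treats a byte-identical
copy as already archived, which, as I understand it, is what would let the
archiver move past a segment that is already sitting in the target directory.

```python
import filecmp
import shutil
import sys
from pathlib import Path


def archive_segment(src: str, dest_dir: str) -> int:
    """Copy one WAL segment into dest_dir and return a shell-style exit status.

    0 means the archiver may treat the segment as archived; non-zero means
    it should keep the segment and retry later.
    """
    source = Path(src)
    target = Path(dest_dir) / source.name

    if target.exists():
        # shallow=False forces a comparison of the file contents.
        if filecmp.cmp(source, target, shallow=False):
            return 0  # already archived with identical contents
        print(f"refusing to overwrite {target}: contents differ",
              file=sys.stderr)
        return 1

    # Copy under a temporary name, then rename, so that a partial copy
    # never shows up under the final segment name.
    tmp = target.with_name(target.name + ".tmp")
    shutil.copyfile(source, tmp)
    tmp.rename(target)
    return 0


# Only act as a command when called with exactly two arguments:
# the segment path and the archive directory.
if __name__ == "__main__" and len(sys.argv) == 3:
    sys.exit(archive_segment(sys.argv[1], sys.argv[2]))
```

Saved as a script, it could be wired in via archive_command with the %p
placeholder (path of the segment to archive) as the first argument and your
archive directory as the second. Whether returning success for an identical
file is appropriate for your setup is something you would need to decide.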

David J.
