Re: wal exist in slave but getting err requested WAL segment has already been removed

From: Mariel Cherkassky <mariel(dot)cherkassky(at)gmail(dot)com>
To: Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>
Cc: pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: wal exist in slave but getting err requested WAL segment has already been removed
Date: 2018-07-11 13:44:24
Message-ID: CA+t6e1n==PHG525gTGAPmx+W8YC4khnT4aL6zeke6FH51q-ECA@mail.gmail.com
Lists: pgsql-admin

Yes, I can see its contents. However, at the end of the output I get the
following message:
pg_xlogdump: FATAL: error in WAL record at 2E61/BDF59950: invalid magic number 0000 in log segment 0000000000002E61000000BD, offset 16105472
Maybe this is the reason behind it?
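
The offset in that error can be compared with the receive location the standby reports. Here is a minimal sketch of the arithmetic in Python, assuming the default 16 MB WAL segment size. The function name `wal_position` is hypothetical and only for illustration:

```python
# Assumption: default WAL segment size of 16 MB (true unless PostgreSQL was
# built with a non-default segment size).
WAL_SEGMENT_SIZE = 16 * 1024 * 1024


def wal_position(timeline, lsn):
    """Map a timeline ID and an LSN such as '2E61/BDF5C000' to
    (segment file name, byte offset inside that segment)."""
    hi_str, lo_str = lsn.split("/")
    hi = int(hi_str, 16)   # high 32 bits of the LSN (the "log" number)
    lo = int(lo_str, 16)   # low 32 bits of the LSN
    segno = lo // WAL_SEGMENT_SIZE    # which 16 MB segment within this log
    offset = lo % WAL_SEGMENT_SIZE    # byte offset within that segment
    name = "{:08X}{:08X}{:08X}".format(timeline, hi, segno)
    return name, offset


if __name__ == "__main__":
    # pg_last_xlog_receive_location from the standby, timeline 9
    print(wal_position(9, "2E61/BDF5C000"))
```

For timeline 9 and the receive location 2E61/BDF5C000 this gives segment 0000000900002E61000000BD at offset 16105472 (0xF5C000), the same offset pg_xlogdump reports. That would be consistent with the file on the standby holding valid WAL only up to the point received so far and zeros after it, although this is an inference from the numbers, not something confirmed here.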

2018-07-11 16:39 GMT+03:00 Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>:

> On 11/07/2018 16:32, Mariel Cherkassky wrote:
>
> The wal is available on the standby, not on the primary. It is already in
> the pg_xlog directory of the slave...
>
> Ok but apparently this is not complete. Can you see its contents with
> pg_waldump (or pg_xlogdump) ?
> Do you have any backup mechanism in place? Any WAL shipping / archiving
> mechanism ?
>
>
> 2018-07-11 16:26 GMT+03:00 Achilleas Mantzios <achill(at)matrix(dot)gatewaynet(dot)com>:
>
>> On 11/07/2018 16:09, Mariel Cherkassky wrote:
>>
>> Hi,
>> I have 3 nodes in my cluster (1 master, version 9.6.3, plus 2 slaves,
>> version 9.6.3). I configured repmgr v4.0.4 (with repmgrd active).
>>
>> Suddenly today, after a few good weeks, I noticed lag on one of the
>> slaves, and the error in its log indicated that the slave didn't get
>> the WAL:
>>
>> could not receive data from WAL stream: ERROR: requested WAL segment
>> 0000000900002E61000000BD has already been removed
>>
>> However, when I check whether the WAL was received:
>> postgres=# select pg_is_in_recovery(), pg_is_xlog_replay_paused(), pg_last_xlog_receive_location(), pg_last_xlog_replay_location();
>>  pg_is_in_recovery | pg_is_xlog_replay_paused | pg_last_xlog_receive_location | pg_last_xlog_replay_location
>> -------------------+--------------------------+-------------------------------+------------------------------
>>  t                 | f                        | 2E61/BDF5C000                 | 2E61/BDF5B930
>> (1 row)
>>
>> I also checked the pg_xlog directory:
>> ls -l ../pg_xlog/0000000900002E61000000BD
>> -rw------- 1 postgres postgres 16777216 Jul 11 11:13 ../pg_xlog/0000000900002E61000000BD
>>
>> and the WAL file exists.
>>
>>
>> On which node did you check for the file?
>> If the file is still available on the primary, compare the md5sum of
>> the two copies.
>> If you have a working WAL shipping method in place, then add the
>> appropriate line to the recovery.conf of your standby:
>>
>> restore_command = 'rsync somemachine:/somepath/pitr/%f "%p" '
>>
>>
>> Now my question is: why wasn't the WAL replayed?
>> In my repmgr.conf I don't have any recovery-related parameters, just some
>> basic settings. The recovery.conf file in the data directory:
>>
>> standby_mode = 'on'
>> primary_conninfo = 'host=xxxxxxx user=repmgr application_name=''psgsqldb2'' connect_timeout=2'
>> recovery_target_timeline = 'latest'
>>
>>
>> Any ideas?
>>
>>
>> --
>> Achilleas Mantzios
>> IT DEV Lead
>> IT DEPT
>> Dynacom Tankers Mgmt
>>
>>
>
> --
> Achilleas Mantzios
> IT DEV Lead
> IT DEPT
> Dynacom Tankers Mgmt
>
>
