Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly

From: "Vitaly Davydov" <v(dot)davydov(at)postgrespro(dot)ru>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Slot's restart_lsn may point to removed WAL segment after hard restart unexpectedly
Date: 2024-11-07 13:30:39
Message-ID: 2802b6-672cc100-29-48025b80@131204041
Lists: pgsql-hackers


Dear Hackers,
 
I'd like to introduce an improved version of my patch (see the attached file). My original idea was to take the restart_lsn saved on disk (slot->restart_lsn_flushed) into account for persistent slots when removing WAL segment files. It helps to avoid errors like: ERROR: requested WAL segment 000...0AA has already been removed.
 
Improvements:
* restart_lsn_flushed is used only for RS_PERSISTENT slots.
* A physical slot is saved to disk on advancing only once: when restart_lsn_flushed is invalid. This is needed because slots with an invalid restart LSN are not considered when calculating the oldest LSN for WAL truncation. Once restart_lsn becomes valid, it should be saved to disk immediately to update restart_lsn_flushed.
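
As a rough illustration of the second point (this is not code from the attached patch), the "flush only once" rule could look like the sketch below. AdvanceSketchSlot, advance_sketch and flush_count are hypothetical names, and the disk write is only simulated by a counter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins: PostgreSQL's XLogRecPtr is also a 64-bit integer
 * and its invalid value is 0. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

/* Hypothetical, heavily reduced slot model (not the real ReplicationSlot). */
typedef struct AdvanceSketchSlot
{
    XLogRecPtr  restart_lsn;            /* in-memory value */
    XLogRecPtr  restart_lsn_flushed;    /* value last written to disk */
    int         flush_count;            /* number of simulated disk writes */
} AdvanceSketchSlot;

/* Advance the slot; write it to disk only if nothing valid was flushed yet. */
static void
advance_sketch(AdvanceSketchSlot *slot, XLogRecPtr new_lsn)
{
    assert(slot != NULL);

    if (new_lsn <= slot->restart_lsn)
        return;                         /* nothing to advance */

    slot->restart_lsn = new_lsn;

    if (slot->restart_lsn_flushed == InvalidXLogRecPtr)
    {
        /* First valid restart LSN: simulate an immediate save to disk. */
        slot->restart_lsn_flushed = slot->restart_lsn;
        slot->flush_count++;
    }
}
```

Later advances change only restart_lsn in memory, so the performance argument against flushing on every advance still holds in this model.
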
Regression tests seem to be ok except:
* recovery/t/001_stream_rep.pl (a checkpoint is needed)
* recovery/t/019_replslot_limit.pl (it seems the slot was invalidated; some adjustments are needed)
* pg_basebackup/t/020_pg_receivewal.pl (not sure about it)
 
There are some problems:
* More WAL segments may be kept. This may lead to slot invalidations in some tests (recovery/t/019_replslot_limit.pl). A couple of tests should be adjusted.
 
With best regards,
Vitaly Davydov

On Thursday, October 31, 2024 13:32 MSK, "Vitaly Davydov" <v(dot)davydov(at)postgrespro(dot)ru> wrote:

 
Sorry, I forgot the attachment. The missing patch is attached now.

On Thursday, October 31, 2024 13:18 MSK, "Vitaly Davydov" <v(dot)davydov(at)postgrespro(dot)ru> wrote:

Dear Hackers,
 
I'd like to discuss a problem with the restart LSN of replication slots. Physical slots are saved to disk at the beginning of a checkpoint. At the end of the checkpoint, old WAL segments are recycled or removed from disk if they are not kept by the slots' restart_lsn values.
 
If an existing physical slot is advanced in the middle of a checkpoint, the WAL segments that correspond to the restart LSN saved on disk may be removed. This happens because the minimal LSN required by replication slots is calculated at the end of the checkpoint, just before old WAL segments are removed, so the calculation already sees the advanced in-memory value. If the postgres instance is hard stopped (pg_ctl stop -m immediate) right after the checkpoint and then restarted, the slot's restart_lsn may point to a removed WAL segment. I believe such behaviour is not good.
 
The doc [0] says that restart_lsn may be set to some past value after a reload. There is a thread [1] on pgsql-hackers where this behaviour is discussed. The main reason for not flushing physical slots on advancing is performance. I'm ok with such behaviour, except that the corresponding WAL segments should not be removed.
 
I propose to keep WAL segments according to the restart_lsn of slots that is saved on disk (flushed). Add a new field restart_lsn_flushed to the ReplicationSlot structure and copy restart_lsn to restart_lsn_flushed in SaveSlotToPath. This doesn't change the on-disk format of the slot contents. I attached a patch. It is not yet complete, but it demonstrates a way to solve the problem.
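
To make the idea more concrete, here is a minimal standalone sketch (it is not the attached patch and not the real PostgreSQL code). KeepSketchSlot, keep_sketch_save_slot and keep_sketch_required_lsn are hypothetical names; only restart_lsn, restart_lsn_flushed and RS_PERSISTENT come from the description above. The sketch assumes that for a persistent slot the older of the in-memory and the flushed restart LSN decides which WAL has to be kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins: PostgreSQL's XLogRecPtr is also a 64-bit integer
 * and its invalid value is 0. */
typedef uint64_t XLogRecPtr;
#define InvalidXLogRecPtr ((XLogRecPtr) 0)

typedef enum { RS_PERSISTENT, RS_EPHEMERAL, RS_TEMPORARY } SketchPersistency;

/* Hypothetical, heavily reduced slot model (not the real ReplicationSlot). */
typedef struct KeepSketchSlot
{
    bool              in_use;
    SketchPersistency persistency;
    XLogRecPtr        restart_lsn;          /* in-memory value */
    XLogRecPtr        restart_lsn_flushed;  /* value last saved to disk */
} KeepSketchSlot;

/* Model of the save step: after the (omitted) write and fsync, remember
 * which restart LSN is now on disk. */
static void
keep_sketch_save_slot(KeepSketchSlot *slot)
{
    assert(slot != NULL);
    slot->restart_lsn_flushed = slot->restart_lsn;
}

/* Oldest LSN that must be kept; InvalidXLogRecPtr if no slot needs WAL. */
static XLogRecPtr
keep_sketch_required_lsn(const KeepSketchSlot *slots, size_t nslots)
{
    XLogRecPtr  min_required = InvalidXLogRecPtr;

    for (size_t i = 0; i < nslots; i++)
    {
        const KeepSketchSlot *s = &slots[i];
        XLogRecPtr  lsn;

        if (!s->in_use)
            continue;

        lsn = s->restart_lsn;

        /* Persistent slots: honour the older, flushed value as well. */
        if (s->persistency == RS_PERSISTENT &&
            s->restart_lsn_flushed != InvalidXLogRecPtr &&
            s->restart_lsn_flushed < lsn)
            lsn = s->restart_lsn_flushed;

        if (lsn == InvalidXLogRecPtr)
            continue;               /* slot does not reserve any WAL */

        if (min_required == InvalidXLogRecPtr || lsn < min_required)
            min_required = lsn;
    }

    return min_required;
}
```

In this model a slot advanced in memory keeps holding back WAL removal until its next save to disk, which is why more segments may be retained than before.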
 
I reproduced the problem in the following way:
* Add some delay in CheckPointBuffers (pg_usleep) to emulate a long checkpoint.
* Execute a checkpoint and, right after it starts, call pg_replication_slot_advance from another connection.
* Hard restart the server right after the checkpoint completes.
* After the restart, the slot's restart_lsn may point to a removed WAL segment.
The proposed patch fixes it.
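
The timeline of the reproduction steps can also be modelled in a few lines of standalone C. This is only a toy model under simplifying assumptions: one slot, the default 16 MB segment size, and no other limits on WAL retention (the redo pointer and settings such as wal_keep_size are ignored). race_sketch_survives and its parameters are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t XLogRecPtr;

/* Assumption: default WAL segment size of 16 MB. */
#define SKETCH_SEG_SIZE ((XLogRecPtr) 16 * 1024 * 1024)

static uint64_t
race_sketch_segno(XLogRecPtr lsn)
{
    return lsn / SKETCH_SEG_SIZE;
}

/*
 * Replays: save slot, advance slot, remove old WAL, crash, restart.
 * Returns true if the restart LSN loaded from disk after the crash still
 * points to an existing segment. keep_by_flushed selects the proposed
 * behaviour (also honour the restart LSN that is on disk).
 */
static bool
race_sketch_survives(XLogRecPtr start_lsn, XLogRecPtr advanced_lsn,
                     bool keep_by_flushed)
{
    assert(advanced_lsn >= start_lsn);

    /* 1. Beginning of checkpoint: the slot is saved to disk. */
    XLogRecPtr  in_memory = start_lsn;
    XLogRecPtr  on_disk = in_memory;

    /* 2. Middle of checkpoint: the slot is advanced in memory only. */
    in_memory = advanced_lsn;

    /* 3. End of checkpoint: segments older than the horizon are removed. */
    XLogRecPtr  horizon = in_memory;

    if (keep_by_flushed && on_disk < horizon)
        horizon = on_disk;

    uint64_t    oldest_kept_segno = race_sketch_segno(horizon);

    /* 4. Hard stop and restart: the in-memory value is lost, reload it. */
    in_memory = on_disk;

    return race_sketch_segno(in_memory) >= oldest_kept_segno;
}
```

With a start LSN in segment 1 and an advance into segment 5, the model reports a lost segment when only the in-memory value is used, and a surviving one when the flushed value is honoured as well.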
 
[0] https://www.postgresql.org/docs/current/logicaldecoding-explanation.html
[1] https://www.postgresql.org/message-id/flat/059cc53a-8b14-653a-a24d-5f867503b0ee%40postgrespro.ru

 

 

Attachment Content-Type Size
0001-Keep-WAL-segments-by-slot-s-flushed-restart-LSN.patch text/x-patch 6.2 KB
