From: | Grigory Smolkin <g(dot)smolkin(at)postgrespro(dot)ru> |
---|---|
To: | pgsql-bugs(at)lists(dot)postgresql(dot)org |
Subject: | Re: logical replication: could not create file "state.tmp": File exists |
Date: | 2019-12-02 16:54:50 |
Message-ID: | c051f18b-fc8f-50d4-53e5-83e750265417@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
On 12/2/19 7:12 PM, Andres Freund wrote:
> Hi,
>
> On 2019-11-30 15:09:39 +0300, Grigory Smolkin wrote:
>> One of my colleagues encountered an out of space condition, which broke his
>> logical replication setup.
>> It`s manifested with the following errors:
>>
>> ERROR: could not receive data from WAL stream: ERROR: could not create
>> file "pg_replslot/some_sub/state.tmp": File exists
> Hm. What was the log output leading to this state? Some cases of this
> would end up in a PANIC, which'd remove the .tmp file during
> recovery. But there's some where we won't - it seems the right fix for
> this would be to unlink the tmp file in that case?
>
>
>> I`ve digged a bit into this problem, and it`s turned out that in
>> SaveSlotToPath() temp file for replication slot is opened with 'O_CREAT |
>> O_EXCL' flags, which makes this routine as not very reentrant.
>>
>> Since an exclusive lock is taken before temp file creation, I think it
>> should be safe to replace O_EXCL with O_TRUNC.
> I'm very doubtful about this. I think it's a good safety measure to
> ensure that there's no previous state file that we're somehow
> overwriting.
Is it possible with exclusive lock taken before that?
>
>
>> Script to reproduce and patch are attached.
> Well:
>
>> # Imitate out_of_space/write_operation_error
>> touch ${PGDATA_PUB}/pg_replslot/mysub/state.tmp
> Doesn't really replicate how we got into this state...
But it replicate the exactly the same state we would get, if write() to
temp file would have failed with out of space.
>
> Greetings,
>
> Andres Freund
>
>
--
Grigory Smolkin
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
From | Date | Subject | |
---|---|---|---|
Next Message | Dmitry Vasiliev | 2019-12-02 17:35:53 | Re: logical replication: could not create file "state.tmp": File exists |
Previous Message | Tomas Vondra | 2019-12-02 16:19:20 | Re: BUG #16144: Segmentation fault on dict_int extension |