From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Heikki Linnakangas <heikki(dot)linnakangas(at)iki(dot)fi> |
Subject: | Re: logical decoding and replication of sequences, take 2 |
Date: | 2023-03-20 17:03:57 |
Message-ID: | 5576654d-8575-137f-d9b6-845a3a3d65ba@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 3/20/23 13:26, Amit Kapila wrote:
> On Mon, Mar 20, 2023 at 5:13 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> On 3/20/23 12:00, Amit Kapila wrote:
>>> On Mon, Mar 20, 2023 at 1:49 PM Tomas Vondra
>>> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>>>
>>>>
>>>> I don't understand why we'd need WAL from before the slot is created,
>>>> which happens before copy_sequence so the sync will see a more recent
>>>> state (reflecting all changes up to the slot LSN).
>>>>
>>>
>>> Imagine the following sequence of events:
>>> 1. Operation on a sequence seq-1 which requires WAL. Say, this is done
>>> at LSN 1000.
>>> 2. Some other random operations on unrelated objects. This would
>>> increase LSN to 2000.
>>> 3. Create a slot that uses current LSN 2000.
>>> 4. Copy sequence seq-1 where you will get the LSN value as 1000. Then
>>> you will use LSN 1000 as a starting point to start replication in
>>> sequence sync worker.
>>>
>>> It is quite possible that WAL from LSN 1000 may not be present. Now,
>>> it may be possible that we use the slot's LSN in this case but
>>> currently, it may not be possible without some changes in the slot
>>> machinery. Even, if we somehow solve this, we have the below problem
>>> where we can miss some concurrent activity.
>>>
>>
>> I think the question is what would be the WAL-requiring operation at LSN
>> 1000. If it's just regular nextval(), then we *will* see it during
>> copy_sequence - sequences are not transactional in the MVCC sense.
>>
>> If it's an ALTER SEQUENCE, I guess it might create a new relfilenode,
>> and then we might fail to apply this - that'd be bad.
>>
>> I wonder if we'd allow actually discarding the WAL while building the
>> consistent snapshot, though.
>>
>
> No, as soon as we reserve the WAL location, we update the slot's
> minLSN (replicationSlotMinLSN) which would prevent the required WAL
> from being removed.
>
>> You're however right we can't just decide
>> this based on LSN, we'd probably need to compare the relfilenodes too or
>> something like that ...
>>
>>>> I think the only "issue" are the WAL records after the slot LSN, or more
>>>> precisely deciding which of the decoded changes to apply.
>>>>
>>>>
>>>>> Now, for the second idea which is to directly use
>>>>> pg_current_wal_insert_lsn(), I think we won't be able to ensure that
>>>>> the changes covered by in-progress transactions like the one with
>>>>> Alter Sequence I have given example would be streamed later after the
>>>>> initial copy. Because the LSN returned by pg_current_wal_insert_lsn()
>>>>> could be an LSN after the LSN associated with Alter Sequence but
>>>>> before the corresponding xact's commit.
>>>>
>>>> Yeah, I think you're right - the locking itself is not sufficient to
>>>> prevent this ordering of operations. copy_sequence would have to lock
>>>> the sequence exclusively, which seems bit disruptive.
>>>>
>>>
>>> Right, that doesn't sound like a good idea.
>>>
>>
>> Although, maybe we could use a less strict lock level? I mean, one that
>> allows nextval() to continue, but would conflict with ALTER SEQUENCE.
>>
>
> I don't know if that is a good idea but are you imagining a special
> interface/mechanism just for logical replication because as far as I
> can see you have used SELECT to fetch the sequence values?
>
Not sure what would the special mechanism be? I don't think it could
read the sequence from somewhere else, and due the lack of MVCC we'd
just read same sequence data from the current relfilenode. Or what else
would it do?
The one thing we can't quite do at the moment is locking the sequence,
because LOCK is only supported for tables. So we could either provide a
function to lock a sequence, or locks it and then returns the current
state (as if we did a SELECT).
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Jeff Davis | 2023-03-20 17:04:42 | Re: Request for comment on setting binary format output per session |
Previous Message | Tom Lane | 2023-03-20 17:01:30 | Re: Add SHELL_EXIT_CODE to psql |