From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
Cc: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
Subject: | Re: logical decoding and replication of sequences, take 2 |
Date: | 2023-07-20 14:51:59 |
Message-ID: | 591c59af-6254-127c-3de0-6ef4445ada05@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 7/20/23 09:24, Ashutosh Bapat wrote:
> Thanks Tomas for the updated patches.
>
> Here are my comments on 0006 patch as well as 0002 patch.
>
> On Wed, Jul 19, 2023 at 4:23 PM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> I think this is an accurate description of what the current patch does.
>> And I think it's a reasonable behavior.
>>
>> My point is that if this gets released in PG17, it'll be difficult to
>> change, even if it does not change on-disk format.
>>
>
> Yes. I agree. And I don't see any problem even if we are not able to change it.
>
>>
>> I need to think behavior about this a bit more, and maybe check how
>> difficult would be implementing it.
>
> Ok.
>
> In most of the comments and in documentation, there are some phrases
> which do not look accurate.
>
> Change to a sequence is being refered to as "sequence increment". While
> ascending sequences are common, PostgreSQL supports descending sequences as
> well. The changes there will be decrements. But that's not the only case. A
> sequence may be restarted with an older value, in which case the change could
> increment or a decrement. I think correct usage is 'changes to sequence' or
> 'sequence changes'.
>
> Sequence being assigned a new relfilenode is referred to as sequence
> being created. This is confusing. When an existing sequence is ALTERed, we
> will not "create" a new sequence but we will "create" a new relfilenode and
> "assign" it to that sequence.
>
> PFA such edits in 0002 and 0006 patches. Let me know if those look
> correct. I think we
> need similar changes to the documentation and comments in other places.
>
OK, I merged the changes into the patches, with some minor changes to
the wording etc.
>>
>> I did however look at the proposed alternative to the "created" flag.
>> The attached 0006 part ditches the flag with XLOG_SMGR_CREATE decoding.
>> The smgr_decode code needs a review (I'm not sure the
>> skipping/fast-forwarding part is correct), but it seems to be working
>> fine overall, although we need to ensure the WAL record has the correct XID.
>>
>
> Briefly describing the patch. When decoding a XLOG_SMGR_CREATE WAL
> record, it adds the relfilenode mentioned in it to the sequences hash.
> When decoding a sequence change record, it checks whether the
> relfilenode in the WAL record is in hash table. If it is the sequence
> changes is deemed transactional otherwise non-transactional. The
> change looks good to me. It simplifies the logic to decide whether a
> sequence change is transactional or not.
>
Right.
> In sequence_decode() we skip sequence changes when fast forwarding.
> Given that smgr_decode() is only to supplement sequence_decode(), I
> think it's correct to do the same in smgr_decode() as well. Simillarly
> skipping when we don't have full snapshot.
>
I don't follow, smgr_decode already checks ctx->fast_forward.
> Some minor comments on 0006 patch
>
> + /* make sure the relfilenode creation is associated with the XID */
> + if (XLogLogicalInfoActive())
> + GetCurrentTransactionId();
>
> I think this change is correct and is inline with similar changes in 0002. But
> I looked at other places from where DefineRelation() is called. For regular
> tables it is called from ProcessUtilitySlow() which in turn does not call
> GetCurrentTransactionId(). I am wondering whether we are just discovering a
> class of bugs caused by not associating an xid with a newly created
> relfilenode.
>
Not sure. Why would it be a bug?
> + /*
> + * If we don't have snapshot or we are just fast-forwarding, there is no
> + * point in decoding changes.
> + */
> + if (SnapBuildCurrentState(builder) < SNAPBUILD_FULL_SNAPSHOT ||
> + ctx->fast_forward)
> + return;
>
> This code block is repeated.
>
Fixed.
> +void
> +ReorderBufferAddRelFileLocator(ReorderBuffer *rb, TransactionId xid,
> + RelFileLocator rlocator)
> +{
> ... snip ...
> +
> + /* sequence changes require a transaction */
> + if (xid == InvalidTransactionId)
> + return;
>
> IIUC, with your changes in DefineSequence() in this patch, this should not
> happen. So this condition will never be true. But in case it happens, this code
> will not add the relfilelocation to the hash table and we will deem the
> sequence change as non-transactional. Isn't it better to just throw an error
> and stop replication if that (ever) happens?
>
It can't happen for sequence, but it may happen when creating a
non-sequence relfilenode. In a way, it's a way to skip (some)
unnecessary relfilenodes.
> Also some comments on 0002 patch
>
> @@ -405,8 +405,19 @@ fill_seq_fork_with_data(Relation rel, HeapTuple
> tuple, ForkNumber forkNum)
>
> /* check the comment above nextval_internal()'s equivalent call. */
> if (RelationNeedsWAL(rel))
> + {
> GetTopTransactionId();
>
> + /*
> + * Make sure the subtransaction has a XID assigned, so that
> the sequence
> + * increment WAL record is properly associated with it. This
> matters for
> + * increments of sequences created/altered in the
> transaction, which are
> + * handled as transactional.
> + */
> + if (XLogLogicalInfoActive())
> + GetCurrentTransactionId();
> + }
> +
>
> I think we should separately commit the changes which add a call to
> GetCurrentTransactionId(). That looks like an existing bug/anomaly
> which can stay irrespective of this patch.
>
Not sure, but I don't see this as a bug.
> + /*
> + * To support logical decoding of sequences, we require the sequence
> + * callback. We decide it here, but only check it later in the wrappers.
> + *
> + * XXX Isn't it wrong to define only one of those callbacks? Say we
> + * only define the stream_sequence_cb() - that may get strange results
> + * depending on what gets streamed. Either none or both?
> + *
> + * XXX Shouldn't sequence be defined at slot creation time, similar
> + * to two_phase? Probably not.
> + */
>
> Do you intend to keep these XXX's as is? My previous comments on this comment
> block are in [1].
>
> In fact, given that whether or not sequences are replicated is decided by the
> protocol version, do we really need LogicalDecodingContext::sequences? Drawing
> parallel with WAL messages, I don't think it's needed.
>
Right. We do that for two_phase because you can override that when
creating the subscription - sequences allowed that too initially, but
then we ditched that. So I don't think we need this.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachment | Content-Type | Size |
---|---|---|
0001-Make-test_decoding-ddl.out-shorter-20230720.patch | text/x-patch | 473.3 KB |
0002-Logical-decoding-of-sequences-20230720.patch | text/x-patch | 52.9 KB |
0003-Add-decoding-of-sequences-to-test_decoding-20230720.patch | text/x-patch | 20.6 KB |
0004-Add-decoding-of-sequences-to-built-in-repli-20230720.patch | text/x-patch | 265.7 KB |
0005-Simplify-protocol-versioning-20230720.patch | text/x-patch | 15.6 KB |
0006-replace-created-flag-with-XLOG_SMGR_CREATE-20230720.patch | text/x-patch | 16.9 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Gustafsson | 2023-07-20 14:52:25 | Re: Printing backtrace of postgres processes |
Previous Message | Daniel Gustafsson | 2023-07-20 14:37:08 | Re: odd buildfarm failure - "pg_ctl: control file appears to be corrupt" |