Quick Links

Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From:	Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To:	Önder Kalacı <onderkalaci(at)gmail(dot)com>
Cc:	"houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date:	2023-03-04 10:33:53
Message-ID:	CAA4eK1+JhdJQerz0x2yx=ecbKng2q0=5QYk51=QXA-Do0Tuvkg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, Mar 3, 2023 at 1:09 PM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
>
>>
>> 5.
>>
>> + /*
>> + * If index scans are disabled, use a sequential scan.
>> + *
>> + * Note that we do not use index scans below when enable_indexscan is
>> + * false. Allowing primary key or replica identity even when index scan is
>> + * disabled is the legacy behaviour. So we hesitate to move the below
>> + * enable_indexscan check to be done earlier in this function.
>> + */
>> + if (!enable_indexscan)
>> + return InvalidOid;
>>
>> Since the document of enable_indexscan says "Enables or disables the query
>> planner's use of index-scan plan types. The default is on.", and we don't use
>> planner here, so I am not sure should we allow/disallow index scan in apply
>> worker based on this GUC.
>>
>
> Given Amit's suggestion on [1], I'm planning to drop this check altogether, and
> rely on table storage parameters.
>

This still seems to be present in the latest version. I think we can
just remove this and then add the additional check as suggested by you
as part of the second patch.

Few other comments on latest version:
==============================
1.
+/*
+ * Returns true if the index is usable for replica identity full. For details,
+ * see FindUsableIndexForReplicaIdentityFull.
+ */
+bool
+IsIndexUsableForReplicaIdentityFull(IndexInfo *indexInfo)
+{
+ bool is_btree = (indexInfo->ii_Am == BTREE_AM_OID);
+ bool is_partial = (indexInfo->ii_Predicate != NIL);
+ bool is_only_on_expression = IsIndexOnlyOnExpression(indexInfo);
+
+ if (is_btree && !is_partial && !is_only_on_expression)
+ {
+ return true;
...
...
+/*
+ * Returns the oid of an index that can be used via the apply worker. The index
+ * should be btree, non-partial and have at least one column reference (e.g.,
+ * should not consist of only expressions). The limitations arise from
+ * RelationFindReplTupleByIndex(), which is designed to handle PK/RI and these
+ * limitations are inherent to PK/RI.

By these two, the patch infers that it picks an index that adheres to
the limitations of PK/RI. Apart from unique, the other properties of
RI are "not partial, not deferrable, and include only columns marked
NOT NULL". See ATExecReplicaIdentity() for corresponding checks. We
don't try to ensure the last two from the list. It is fine to do so if
we document the reasons for the same in comments or we can even try to
enforce the remaining restrictions as well. For example, it should be
okay to allow NULL column values because we anyway compare the entire
tuple after getting the value from the index.

2.
+ {
+ /*
+ * This attribute is an expression, and
+ * FindUsableIndexForReplicaIdentityFull() was called earlier
+ * when the index for subscriber was selected. There, the indexes
+ * comprising *only* expressions have already been eliminated.
+ *
+ * Also, because PK/RI can't include expressions we
+ * sanity check the index is neither of those kinds.
+ */
+ Assert(!IdxIsRelationIdentityOrPK(rel, idxrel->rd_id));

This comment doesn't make much sense after you have moved the
corresponding Assert in RelationFindReplTupleByIndex(). Either we
should move or remove this Assert as well or at least update the
comments to reflect the latest code.

3. When FindLogicalRepUsableIndex() is invoked from
logicalrep_partition_open(), the current memory context would be
LogicalRepPartMapContext which would be a long-lived context and we
allocate memory for indexes in FindLogicalRepUsableIndex() which can
accumulate over a period of time. So, I think it would be better to
switch to the old context in logicalrep_partition_open() before
invoking FindLogicalRepUsableIndex() provided that is not a long-lived
context.

--
With Regards,
Amit Kapila.

In response to

Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher at 2023-03-03 07:39:43 from Önder Kalacı

Responses

Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher at 2023-03-06 09:08:20 from Önder Kalacı

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Drouvot, Bertrand	2023-03-04 11:19:57	Re: Minimal logical decoding on standbys
Previous Message	Drouvot, Bertrand	2023-03-04 08:33:45	Re: Fix comments in gistxlogDelete, xl_heap_freeze_page and xl_btree_delete