From: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com> |
---|---|
To: | Önder Kalacı <onderkalaci(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Peter Smith <smithpb2250(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher |
Date: | 2023-03-08 06:51:07 |
Message-ID: | OS0PR01MB5716D94BDE679A012AB4524694B49@OS0PR01MB5716.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Tuesday, March 7, 2023 9:47 PM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
Hi,
> > > Let me give an example to demonstrate why I thought something is fishy here:
> > >
> > > Imagine rel has a (non-default) REPLICA IDENTITY with Oid=1111.
> > > Imagine the same rel has a PRIMARY KEY with Oid=2222.
> > >
>
> Hmm, alright, this is syntactically possible, but not sure if any user
> would do this. Still thanks for catching this.
>
> And, you are right, if a user has created such a schema, IdxIsRelationIdentityOrPK()
> would return the wrong result and we'd use sequential scan instead of index scan.
> This would be a regression. I think we should change the function.
I am looking at the latest patch and have a question about the following code.
/* Try to find the tuple */
- if (index_getnext_slot(scan, ForwardScanDirection, outslot))
+ while (index_getnext_slot(scan, ForwardScanDirection, outslot))
{
- found = true;
+ /*
+ * Avoid expensive equality check if the index is primary key or
+ * replica identity index.
+ */
+ if (!idxIsRelationIdentityOrPK)
+ {
+ if (eq == NULL)
+ {
+#ifdef USE_ASSERT_CHECKING
+ /* apply assertions only once for the input idxoid */
+ IndexInfo *indexInfo = BuildIndexInfo(idxrel);
+ Assert(IsIndexUsableForReplicaIdentityFull(indexInfo));
+#endif
+
+ /*
+ * We only need to allocate once. This is allocated within per
+ * tuple context -- ApplyMessageContext -- hence no need to
+ * explicitly pfree().
+ */
+ eq = palloc0(sizeof(*eq) * outslot->tts_tupleDescriptor->natts);
+ }
+
+ if (!tuples_equal(outslot, searchslot, eq))
+ continue;
+ }
IIRC, it invokes tuples_equal for all cases unless we are using replica
identity key or primary key to scan. But there seem some other cases where the
tuples_equal looks unnecessary.
For example, if the table on subscriber don't have a PK or RI key but have a
not-null, non-deferrable, unique key. And if the apply worker choose this index
to do the scan, it seems we can skip the tuples_equal as well.
--Example
pub:
CREATE TABLE test_replica_id_full (a int, b int not null);
ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;
CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full;
sub:
CREATE TABLE test_replica_id_full (a int, b int not null);
CREATE UNIQUE INDEX test_replica_id_full_idx ON test_replica_id_full(b);
--
I am not 100% sure if it's worth optimizing this by complicating the check in
idxIsRelationIdentityOrPK. What do you think ?
Best Regards,
Hou zj
From | Date | Subject | |
---|---|---|---|
Next Message | Julien Rouhaud | 2023-03-08 06:56:14 | Re: pg_upgrade and logical replication |
Previous Message | Peter Eisentraut | 2023-03-08 06:21:58 | Re: Add standard collation UNICODE |