Quick Links

RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From:	"houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
To:	Önder Kalacı <onderkalaci(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc:	Peter Smith <smithpb2250(at)gmail(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>, Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject:	RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date:	2023-03-08 07:24:33
Message-ID:	OS0PR01MB5716B9EC1FCE3FC04C50DB8394B49@OS0PR01MB5716.jpnprd01.prod.outlook.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Wednesday, March 8, 2023 2:51 PM houzj(dot)fnst(at)fujitsu(dot)com <houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Tuesday, March 7, 2023 9:47 PM Önder Kalacı <onderkalaci(at)gmail(dot)com>
> wrote:
>
> Hi,
>
> > > > Let me give an example to demonstrate why I thought something is fishy
> here:
> > > >
> > > > Imagine rel has a (non-default) REPLICA IDENTITY with Oid=1111.
> > > > Imagine the same rel has a PRIMARY KEY with Oid=2222.
> > > >
> >
> > Hmm, alright, this is syntactically possible, but not sure if any user
> > would do this. Still thanks for catching this.
> >
> > And, you are right, if a user has created such a schema,
> > IdxIsRelationIdentityOrPK() would return the wrong result and we'd use
> sequential scan instead of index scan.
> > This would be a regression. I think we should change the function.
>
> I am looking at the latest patch and have a question about the following code.
>
> /* Try to find the tuple */
> - if (index_getnext_slot(scan, ForwardScanDirection, outslot))
> + while (index_getnext_slot(scan, ForwardScanDirection, outslot))
> {
> - found = true;
> + /*
> + * Avoid expensive equality check if the index is primary key or
> + * replica identity index.
> + */
> + if (!idxIsRelationIdentityOrPK)
> + {
> + if (eq == NULL)
> + {
> +#ifdef USE_ASSERT_CHECKING
> + /* apply assertions only once for the input
> idxoid */
> + IndexInfo *indexInfo = BuildIndexInfo(idxrel);
> +
> Assert(IsIndexUsableForReplicaIdentityFull(indexInfo));
> +#endif
> +
> + /*
> + * We only need to allocate once. This is
> allocated within per
> + * tuple context -- ApplyMessageContext --
> hence no need to
> + * explicitly pfree().
> + */
> + eq = palloc0(sizeof(*eq) *
> outslot->tts_tupleDescriptor->natts);
> + }
> +
> + if (!tuples_equal(outslot, searchslot, eq))
> + continue;
> + }
>
> IIRC, it invokes tuples_equal for all cases unless we are using replica identity key
> or primary key to scan. But there seem some other cases where the
> tuples_equal looks unnecessary.
>
> For example, if the table on subscriber don't have a PK or RI key but have a
> not-null, non-deferrable, unique key. And if the apply worker choose this index
> to do the scan, it seems we can skip the tuples_equal as well.
>
> --Example
> pub:
> CREATE TABLE test_replica_id_full (a int, b int not null); ALTER TABLE
> test_replica_id_full REPLICA IDENTITY FULL; CREATE PUBLICATION
> tap_pub_rep_full FOR TABLE test_replica_id_full;
>
> sub:
> CREATE TABLE test_replica_id_full (a int, b int not null); CREATE UNIQUE INDEX
> test_replica_id_full_idx ON test_replica_id_full(b);

Thinking again. This example is incorrect, sorry. I mean the case when
all the columns of the tuple to be compared are in the unique index on
subscriber side, like:

--Example
pub:
CREATE TABLE test_replica_id_full (a int);
ALTER TABLE test_replica_id_full REPLICA IDENTITY FULL;
CREATE PUBLICATION tap_pub_rep_full FOR TABLE test_replica_id_full;

sub:
CREATE TABLE test_replica_id_full (a int not null);
CREATE UNIQUE INDEX test_replica_id_full_idx ON test_replica_id_full(a);
--

Best Regards,
Hou zj

In response to

RE: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher at 2023-03-08 06:51:07 from houzj.fnst@fujitsu.com

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2023-03-08 07:28:11	Re: Add pg_walinspect function with block info columns
Previous Message	Peter Eisentraut	2023-03-08 07:10:02	Re: Add standard collation UNICODE