Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Önder Kalacı <onderkalaci(at)gmail(dot)com>
Cc: Marco Slot <marco(dot)slot(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, shiy(dot)fnst(at)fujitsu(dot)com, wangw(dot)fnst(at)fujitsu(dot)com, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: [PATCH] Use indexes on the subscriber when REPLICA IDENTITY is full on the publisher
Date: 2023-01-30 13:16:07
Message-ID: CAA4eK1L6P8hM+17fNyWogSnueTJebvZUX7YseL54HSFpX_0m_A@mail.gmail.com
Lists: pgsql-hackers

On Fri, Jan 27, 2023 at 6:32 PM Önder Kalacı <onderkalaci(at)gmail(dot)com> wrote:
>>
>> I suppose the options are:
>> 1. use regular planner uniformly
>> 2. use regular planner only when there's no replica identity (or configurable?)
>> 3. only use low-level functions
>> 4. keep using sequential scans for every single updated row
>> 5. introduce a hidden logical row identifier in the heap that is guaranteed unique within a table and can be used as a replica identity when no unique index exists
>
>
> One other option I considered was to ask the index explicitly on the subscriber side from the user when REPLICA IDENTITY is FULL. But, it is a pretty hard choice for any user, even a planner sometimes fails to pick the right index :) Also, it is probably controversial to change any of the APIs for this purpose?
>

I agree that it won't be a very convenient option for the user. But how
about this: along with asking the user for an index, when the user
hasn't provided one we also allow the use of any unique index over a
subset of the transmitted columns, and if there is more than one
candidate index, we pick any one of them. Additionally, we can allow
disabling the use of an index scan for this particular case. If we are
too worried about the API change needed to let users specify the index,
then we can do that later or as a separate patch.
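
To make the proposed selection order concrete, here is a minimal sketch
in Python. It is only an illustration of the idea, not PostgreSQL code:
the names IndexInfo, choose_index, user_index and enable_index_scan are
all hypothetical, and the real logic would live in the C apply worker.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class IndexInfo:
    """Hypothetical description of one index on the subscriber table."""
    name: str
    columns: Tuple[str, ...]
    is_unique: bool


def choose_index(
    indexes: Sequence[IndexInfo],
    transmitted_columns: Sequence[str],
    user_index: Optional[str] = None,   # assumed user-facing option
    enable_index_scan: bool = True,     # assumed switch to disable index use
) -> Optional[IndexInfo]:
    """Return the index to use, or None to fall back to a sequential scan."""
    # The user may turn off index scans for this case entirely.
    if not enable_index_scan:
        return None

    # An index named explicitly by the user takes precedence.
    if user_index is not None:
        for idx in indexes:
            if idx.name == user_index:
                return idx
        raise ValueError(f"index {user_index!r} does not exist")

    # Otherwise take any unique index whose columns are all transmitted.
    # With several candidates, the first one found is as good as any.
    transmitted = set(transmitted_columns)
    for idx in indexes:
        if idx.is_unique and set(idx.columns) <= transmitted:
            return idx

    # No suitable index: keep using a sequential scan.
    return None
```

Under these assumptions, a non-unique index is used only when the user
names it explicitly, which keeps the automatic path free of any
cost-based choice.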

> I'd be happy to hear from more experienced hackers on the trade-offs for the above, and I'd be open to work on that if there is a clear winner. For me (3) is a decent solution for the problem.
>

From the discussion above, it is not very clear that the added
maintenance cost in this area is worth it, even though it can give
better results as far as this feature is concerned.

--
With Regards,
Amit Kapila.
