Re: row filtering for logical replication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Euler Taveira <euler(at)eulerto(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-08-26 10:11:06
Message-ID: CAHut+PtFxoP477E8odkxkpyDoH_tiNBSiJH-j88NR728nxPErQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 26, 2021 at 3:00 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, Aug 26, 2021 at 9:51 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> >
> > On Thu, Aug 26, 2021 at 1:20 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Aug 26, 2021 at 7:37 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > > >
> > > > On Wed, Aug 25, 2021 at 3:28 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > ...
> > > > >
> > > > > Hmm, I think the gain via caching is not visible because we are using
> > > > > simple expressions. It will be visible when we use somewhat complex
> > > > > expressions where expression evaluation cost is significant.
> > > > > Similarly, the impact of this change will magnify and it will also be
> > > > > visible when a publication has many tables. Apart from performance,
> > > > > this change is logically correct as well because it would be any way
> > > > > better if we don't invalidate the cached expressions unless required.
> > > >
> > > > Please tell me what is your idea of a "complex" row filter expression.
> > > > Do you just mean a filter that has multiple AND conditions in it? I
> > > > don't really know if few complex expressions would amount to any
> > > > significant evaluation costs, so I would like to run some timing tests
> > > > with some real examples to see the results.
> > > >
> > >
> > > I think this means you didn't even understand or are convinced why the
> > > patch has cache in the first place. As per your theory, even if we
> > > didn't have cache, it won't matter but that is not true otherwise, the
> > > patch wouldn't have it.
> >
> > I have never said there should be no caching. On the contrary, my
> > performance test results [1] already confirmed that caching ExprState
> > is of benefit for the millions of times it may be used in the
> > pgoutput_row_filter function. My only doubts are in regard to how much
> > observable impact there would be re-evaluating the filter expression
> > just a few extra times by the get_rel_sync_entry function.
> >
>
> I think it depends but why in the first place do you want to allow
> re-evaluation when there is a way for not doing that?

Because the current logic of "delayed" ExprState evaluation does come
at some cost:
a. It needs an extra condition and more code in the function
   pgoutput_row_filter.
b. It needs the additional Node list to be maintained.

If we chose not to implement a delayed ExprState cache evaluation then
there would still be a (one-time) ExprState cache evaluation, but it
would happen whenever get_rel_sync_entry is called (regardless of
whether pgoutput_row_filter is subsequently called). E.g. there can be
some rebuilds of the ExprState cache if the user calls TRUNCATE.
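
To make the trade-off concrete, here is a rough, self-contained C
sketch. It is not the patch code: SketchEntry, build_filter, the
eager_*/lazy_* functions and sketch_run are all hypothetical stand-ins,
the "filter" is a trivial integer comparison, and it assumes that every
TRUNCATE-like change arrives right after a cache invalidation. It only
counts how often the filter state would be rebuilt under each approach.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a cached per-relation entry (not the real struct). */
typedef struct SketchEntry
{
    bool valid;          /* false after an invalidation */
    bool filter_built;   /* true once the "ExprState" has been prepared */
    int  pending_nodes;  /* lazy only: stands in for the extra Node list */
    int  build_count;    /* number of times the filter was prepared */
} SketchEntry;

/* Stands in for the one-time ExprState preparation. */
static void build_filter(SketchEntry *e)
{
    e->filter_built = true;
    e->build_count++;
}

static void sketch_invalidate(SketchEntry *e)
{
    e->valid = false;
    e->filter_built = false;
}

/* Eager: prepare the filter as soon as the entry is (re)validated. */
static void eager_get_entry(SketchEntry *e)
{
    if (!e->valid)
    {
        build_filter(e);
        e->valid = true;
    }
}

static bool eager_row_filter(SketchEntry *e, int row)
{
    (void) e;
    return row > 10;     /* toy filter condition */
}

/* Lazy ("delayed"): only remember the filter nodes here. */
static void lazy_get_entry(SketchEntry *e)
{
    if (!e->valid)
    {
        e->pending_nodes = 1;   /* cost (b): extra list to maintain */
        e->valid = true;
    }
}

static bool lazy_row_filter(SketchEntry *e, int row)
{
    if (!e->filter_built)       /* cost (a): extra condition and code */
    {
        build_filter(e);
        e->pending_nodes = 0;
    }
    return row > 10;
}

/*
 * Replay "truncates" invalidate-then-lookup events that never reach the
 * row filter, followed by "rows" filtered changes. Returns the number of
 * filter builds that took place.
 */
int sketch_run(bool lazy, int truncates, int rows)
{
    SketchEntry e = {false, false, 0, 0};

    for (int i = 0; i < truncates; i++)
    {
        sketch_invalidate(&e);
        if (lazy)
            lazy_get_entry(&e);
        else
            eager_get_entry(&e);
    }
    for (int j = 0; j < rows; j++)
    {
        if (lazy)
        {
            lazy_get_entry(&e);
            (void) lazy_row_filter(&e, j);
        }
        else
        {
            eager_get_entry(&e);
            (void) eager_row_filter(&e, j);
        }
    }
    return e.build_count;
}
```

Under those assumptions, sketch_run(false, 3, 5) reports 3 builds and
sketch_run(true, 3, 5) reports 1, so the eager variant does a couple of
extra builds in exchange for dropping the extra condition and the
pending list. Whether those extra builds cost anything measurable is
exactly what timing tests would need to show.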

I guess I felt the only justification for implementing more
sophisticated cache logic is if it gives a performance gain. But if
there is no observable difference, then maybe it's better to just keep
the code simpler. That is why I have been questioning how much time a
one-time ExprState cache evaluation really takes, and whether a few
extra ones would even be noticeable.

------
Kind Regards,
Peter Smith.
Fujitsu Australia.
