Re: row filtering for logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Önder Kalacı <onderkalaci(at)gmail(dot)com>, japin <japinli(at)hotmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, David Steele <david(at)pgmasters(dot)net>, Craig Ringer <craig(at)2ndquadrant(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: row filtering for logical replication
Date: 2021-07-14 09:29:26
Message-ID: CAA4eK1KGP6ek_iWU2BwL3+AERGXY52RZ5sd8Lk9KVvjCS+VBGw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Jul 14, 2021 at 12:37 AM Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>
> On 7/13/21 5:44 PM, Jeff Davis wrote:
> > On Tue, 2021-07-13 at 10:24 +0530, Amit Kapila wrote:
> > Also:
> >
> > * Andres also mentioned that the function should not leak memory.
> > * One use case for this feature is when sharding a table, so the
> > expression should allow things like "hashint8(x) between ...". I'd
> > really like to see this problem solved, as well.
> >
>
> I think built-in functions should be fine, because generally don't get
> dropped etc. (And if you drop built-in function, well - sorry.)
>

I am not sure if all built-in functions are also safe. I think we
can't allow volatile functions (ex. setval) that can update the
database which doesn't seem to be allowed in the historic snapshot.
Similarly, it might not be okay to invoke stable functions that access
the database as those might expect current snapshot. I think immutable
functions should be okay but that brings us to Jeff's question of can
we tie the marking of functions that can be used here with
IMMUTABLE/STABLE/VOLATILE designation? The UDFs might have a higher
risk that something used in those functions can be dropped but I guess
we can address that by using the current snapshot to access the
publication catalog.

> Not sure about the memory leaks - I suppose we'd free memory for each
> row, so this shouldn't be an issue I guess ...
>
> >> I think in the long run one idea to allow UDFs is probably by
> >> explicitly allowing users to specify whether the function is
> >> publication predicate safe and if so, then we can allow such
> >> functions
> >> in the filter clause.
> >
> > This sounds like a better direction. We probably need some kind of
> > catalog information here to say what functions/operators are "safe" for
> > this kind of purpose. There are a couple questions:
> >
>
> Not sure. It's true it's a bit like volatile/stable/immutable categories
> where we can't guarantee those labels are correct, and it's up to the
> user to keep the pieces if they pick the wrong category.
>
> But we can achieve the same goal by introducing a simple GUC called
> dangerous_allow_udf_in_decoding, I think.
>

One guc for all UDFs sounds dangerous.

--
With Regards,
Amit Kapila.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Peifeng Qiu 2021-07-14 09:30:18 Re: Support kerberos authentication for postgres_fdw
Previous Message Magnus Hagander 2021-07-14 09:09:54 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?