From: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Rahila Syed <rahilasyed90(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com> |
Subject: | Re: Column Filtering in Logical Replication |
Date: | 2022-03-11 12:50:29 |
Message-ID: | 2b8471f9-3849-c91f-ef82-9181d4bca0a7@enterprisedb.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 3/11/22 10:52, Amit Kapila wrote:
> On Fri, Mar 11, 2022 at 7:26 AM Tomas Vondra
> <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
>>
>> On 3/10/22 20:10, Tomas Vondra wrote:
>>>
>>>
>>> FWIW I think the reason is pretty simple - pgoutput_row_filter_init is
>>> broken. It assumes you can just do this
>>>
>>> rftuple = SearchSysCache2(PUBLICATIONRELMAP,
>>> ObjectIdGetDatum(entry->publish_as_relid),
>>> ObjectIdGetDatum(pub->oid));
>>>
>>> if (HeapTupleIsValid(rftuple))
>>> {
>>> /* Null indicates no filter. */
>>> rfdatum = SysCacheGetAttr(PUBLICATIONRELMAP, rftuple,
>>> Anum_pg_publication_rel_prqual,
>>> &pub_no_filter);
>>> }
>>> else
>>> {
>>> pub_no_filter = true;
>>> }
>>>
>>>
>>> and pub_no_filter=true means there's no filter at all. Which is
>>> nonsense, because we're using publish_as_relid here - the publication
>>> may not include this particular ancestor, in which case we need to just
>>> ignore this publication.
>>>
>>> So yeah, this needs to be reworked.
>>>
>>
>> I spent a bit of time looking at this, and I think a minor change in
>> get_rel_sync_entry() fixes this - it's enough to ensure rel_publications
>> only includes publications that actually include publish_as_relid.
>>
>
> Thanks for looking into this. I think in the first patch before
> calling get_partition_ancestors() we need to ensure it is a partition
> (the call expects that) and pubviaroot is true.
Does the call really require that? Also, I'm not sure why we'd need to
look at pubviaroot - that's already considered earlier when calculating
publish_as_relid, here we just need to know the relationship of the two
OIDs (if one is ancestor/child of the other).
> I think it would be
> good if we can avoid an additional call to get_partition_ancestors()
> as it could be costly.
Maybe. OTOH we only should do this only very rarely anyway.
> I wonder why it is not sufficient to ensure
> that publish_as_relid exists after ancestor in ancestors list before
> assigning the ancestor to publish_as_relid? This only needs to be done
> in case of (if (!publish)). I have not tried this so I could be wrong.
>
I'm not sure what exactly are you proposing. Maybe try coding it? That's
probably faster than trying to describe what the code might do ...
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2022-03-11 12:53:15 | Re: logical decoding and replication of sequences |
Previous Message | Tomas Vondra | 2022-03-11 12:36:11 | Re: Column Filtering in Logical Replication |