Re: Data is copied twice when specifying both child and parent table in publication

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "wangw(dot)fnst(at)fujitsu(dot)com" <wangw(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, "osumi(dot)takamichi(at)fujitsu(dot)com" <osumi(dot)takamichi(at)fujitsu(dot)com>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Greg Nancarrow <gregn4422(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>
Subject: Re: Data is copied twice when specifying both child and parent table in publication
Date: 2022-07-14 04:45:35
Message-ID: CAHut+Pu=POft2HuOqJt_7-OJHseajic5Ukj8J89qAJQ4wRXXHQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for the v6 patch (HEAD only):

============
HEAD_v6-0001
============

1. Commit message

If there are two publications that publish the parent table and the child table
separately, and both specify the option PUBLISH_VIA_PARTITION_ROOT, subscribing
to both publications from one subscription causes initial copy twice. What we
expect is to be copied only once.

~

I don’t think the parameter even works in uppercase, so maybe better to say:
PUBLISH_VIA_PARTITION_ROOT -> 'publish_via_partition_root'

~~~

2.

What we expect is to be copied only once.

SUGGESTION
It should only be copied once.

~~~

3.

To fix this, we extend the API of the function pg_get_publication_tables.
Now, the function pg_get_publication_tables could receive the publication list.
And then, if we specify option viaroot, we could exclude the partitioned table
whose ancestor belongs to the publication list when getting the table
informations.

~

Don't you mean "partition table" instead of "partitioned table"?

SUGGESTION (also reworded)
To fix this, the API function pg_get_publication_tables has been
extended to take a publication list. Now, when getting the table
information, if the publish_via_partition_root is true, the function
can exclude a partition table whose ancestor is also published by the
same publication list.

======

4. src/backend/catalog/pg_publication.c - pg_get_publication_tables

- publication = GetPublicationByName(pubname, false);
+ arr = PG_GETARG_ARRAYTYPE_P(0);
+ deconstruct_array(arr, TEXTOID, -1, false, TYPALIGN_INT,
+ &elems, NULL, &nelems);

Maybe should have some comment to describe that this function
parameter is now an array of publications names.

~~~

5.

+ /* get Oids of tables from each publication */

Uppercase comment

~~~

6.

+ ArrayType *arr;
+ Datum *elems;
+ int nelems,
+ i;
+ Publication *publication;
+ bool viaroot = false;
+ List *pub_infos = NIL;
+ ListCell *lc1,
+ *lc2;

The 'publication' should be declared only in the loop that uses it.
It's also not good that this is shadowing the same variable name in a
later declaration.

~~~

7.

+ * Publications support partitioned tables, although all changes
+ * are replicated using leaf partition identity and schema, so we
+ * only need those.
*/
+ if (publication->alltables)
+ current_tables = GetAllTablesPublicationRelations(publication->pubviaroot);
+ else
+ {
+ List *relids,
+ *schemarelids;
+
+ relids = GetPublicationRelations(publication->oid,
+ publication->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ schemarelids = GetAllSchemaPublicationRelations(publication->oid,
+ publication->pubviaroot ?
+ PUBLICATION_PART_ROOT :
+ PUBLICATION_PART_LEAF);
+ current_tables = list_concat(relids, schemarelids);
+ }

Somehow I was confused by this comment because it says you only need
the LEAF tables but then the subsequent code is getting ROOT relations
anyway... Can you clarify the comment some more?

~~~

8.

+ bool viaroot = false;

I think that should have a comment something like:
/* At least one publication is using publish_via_partition_root */

~~~

9.

+ /*
+ * Record the published table and the corresponding publication so
+ * that we can get row filters and column list later.
+ */
+ foreach(lc1, tables)
+ {
+ Oid relid = lfirst_oid(lc1);
+
+ foreach(lc2, pub_infos)
+ {
+ pub_info *pubinfo = (pub_info *) lfirst(lc2);
+
+ if (list_member_oid(pubinfo->table_list, relid))
+ {
+ Oid *result = (Oid *) malloc(sizeof(Oid) * 2);
+
+ result[0] = relid;
+ result[1] = pubinfo->pubid;
+
+ results = lappend(results, result);
+ }
+ }
}

I felt a bit uneasy about the double-looping here. I wonder if these
'results' could have been accumulated within the existing loop over
all publications. Then the results would need to be filtered to remove
the ones associated with removed partitions. Otherwise with 10000
tables and also many publications this (current) double-looping seems
like it might be quite expensive.

======

10. src/backend/commands/subscriptioncmds.c - fetch_table_list

+ if (check_columnlist && server_version >= 160000)

This condition does not make much sense to me. Isn’t it effectively
same as saying
if (server_version >= 150000 && server_version >= 160000)

???

~~~

11.

+ /*
+ * Get the list of tables from publisher, the partitioned table whose
+ * ancestor is also in this list should be ignored, otherwise the
+ * initial date in the partitioned table would be replicated twice.
+ */

11.a
Isn't this comment all backwards? I think you mean to say "partition"
or "partition table" (not partitioned table) because partitions have
ancestors but partition-ED tables don't.

11.b
"initial date" -> "initial data"

======

12. src/test/subscription/t/013_partition.pl

-# Note: We create two separate tables, not a partitioned one, so that we can
-# easily identity through which relation were the changes replicated.
+# Note: We only create one table for the partition table (tab4) here.
+# Because we specify option PUBLISH_VIA_PARTITION_ROOT (see pub_all and
+# pub_lower_level above), all data should be replicated to the partition table.
+# So we do not need to create table for the partitioned table.

12.a
AFAIK "tab4" is the *partitioned* table, not a partition. I think this
comment has all the "partitioned" and "partition" back-to-front.

12.b
Also please say “publish_via_partition_root" instead of
PUBLISH_VIA_PARTITION_ROOT

======

13. src/test/subscription/t/028_row_filter.pl

@@ -723,8 +727,11 @@ is($result, qq(t|1), 'check replicated rows to
tab_rowfilter_toast');
# - INSERT (16) YES, 16 > 15
$result =
$node_subscriber->safe_psql('postgres',
- "SELECT a FROM tab_rowfilter_viaroot_part");
-is($result, qq(16), 'check replicated rows to tab_rowfilter_viaroot_part');
+ "SELECT a FROM tab_rowfilter_viaroot_part ORDER BY 1");
+is($result, qq(16
+17),
+ 'check replicated rows to tab_rowfilter_viaroot_part'
+);

There is a comment above that code like:
# tab_rowfilter_viaroot_part filter is: (a > 15)
# - INSERT (14) NO, 14 < 15
# - INSERT (15) NO, 15 = 15
# - INSERT (16) YES, 16 > 15

I think should modify that comment to explain the new data this patch
inserts - e.g. NO for 13 and YES for 17...

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2022-07-14 04:48:35 Re: pg_parameter_aclcheck() and trusted extensions
Previous Message John Naylor 2022-07-14 04:16:55 Re: [PoC] Improve dead tuple storage for lazy vacuum