RE: Logical Replication of sequences

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: 'vignesh C' <vignesh21(at)gmail(dot)com>
Cc: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Euler Taveira <euler(at)eulerto(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
Subject: RE: Logical Replication of sequences
Date: 2024-12-27 09:48:33
Message-ID: OSCPR01MB14966DA8CB749A0D4E9F3F7A7F50E2@OSCPR01MB14966.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dear Vignesh,

Thanks for updating the patch! Here are my comments.

01. SyncingRelationsState

```
* SYNC_RELATIONS_STATE_NEEDS_REBUILD The subscription relations state is no
* longer valid and the subscription relations should be rebuilt.
```

Can we follow the style like other lines? Like:

SYNC_RELATIONS_STATE_NEEDS_REBUILD indicates that the subscription relations
state is no longer valid and the subscription relations should be rebuilt.

02. FetchRelationStates()

```
/* Fetch tables and sequences that are in non-ready state. */
rstates = GetSubscriptionRelations(MySubscription->oid, true, true,
false);
```

I think rstates should be list_free_deep()'d after the foreach().

03. LogicalRepSyncSequences

```
char slotname[NAMEDATALEN];
...
snprintf(slotname, NAMEDATALEN, "pg_%u_sync_sequences_" UINT64_FORMAT,
subid, GetSystemIdentifier());

/*
* Here we use the slot name instead of the subscription name as the
* application_name, so that it is different from the leader apply worker,
* so that synchronous replication can distinguish them.
*/
LogRepWorkerWalRcvConn =
walrcv_connect(MySubscription->conninfo, true, true,
must_use_password,
slotname, &err);
```

Hmm, IIUC the sequence sync worker does not require any replication slots.
I feel the variable name should be "application_name" and the comment can be updated.

04. LogicalRepSyncSequences

```
/* Get the sequences that should be synchronized. */
sequences = GetSubscriptionRelations(subid, false, true, false);
```

I think rstates should be list_free_deep()'d after the foreach_ptr().

05. LogicalRepSyncSequences

```
/*
* COPY FROM does not honor RLS policies. That is not a problem for
* subscriptions owned by roles with BYPASSRLS privilege (or
* superuser, who has it implicitly), but other roles should not be
* able to circumvent RLS. Disallow logical replication into RLS
* enabled relations for such roles.
*/
if (check_enable_rls(RelationGetRelid(sequence_rel), InvalidOid, false) == RLS_ENABLED)
ereport(ERROR,
errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("user \"%s\" cannot replicate into sequence with row-level security enabled: \"%s\"",
GetUserNameFromId(GetUserId(), true),
RelationGetRelationName(sequence_rel)));
```

Can we really set a row level security policy to sequences? I've tested but
I couldn't.

```
postgres=# CREATE SEQUENCE s;
CREATE SEQUENCE
postgres=# ALTER TABLE s ENABLE ROW LEVEL SECURITY ;
ERROR: ALTER action ENABLE ROW SECURITY cannot be performed on relation "s"
DETAIL: This operation is not supported for sequences.
```

06. copy_sequence

```
appendStringInfo(&cmd, "SELECT c.oid, c.relkind\n"
"FROM pg_catalog.pg_class c\n"
"INNER JOIN pg_catalog.pg_namespace n\n"
" ON (c.relnamespace = n.oid)\n"
"WHERE n.nspname = %s AND c.relname = %s",
quote_literal_cstr(nspname),
quote_literal_cstr(relname));
```

I feel the function is not so efficient because it can obtain only a tuple, i.e.,
information for one sequence at once. I think you basically copied from fetch_remote_table_info(),
but it was OK for it because the tablesync worker handles only a table.

Can you obtain all sequences at once and check whether each sequences match or not?

07. LogicalRepSyncSequences

```
/* LOG all the sequences synchronized during current batch. */
for (int i = 0; i < curr_batch_seq; i++)
{
SubscriptionRelState *done_seq;

done_seq = (SubscriptionRelState *) lfirst(list_nth_cell(sequences_not_synced,
(curr_seq - curr_batch_seq) + i));

ereport(DEBUG1,
errmsg_internal("logical replication synchronization for subscription \"%s\", sequence \"%s\" has finished",
get_subscription_name(subid, false), get_rel_name(done_seq->relid)));
}
```

The loop is needed only when the debug messages should be output the system.
Can we use message_level_is_interesting() to skip the loop for some cases?

08. pg_dump.c

```
/*
* getSubscriptionTables
* Get information about subscription membership for dumpable tables. This
* will be used only in binary-upgrade mode for PG17 or later versions.
*/
void
getSubscriptionTables(Archive *fout)
```

I was bit confused of the pg_dump codes. I doubt that the pg_upgrade might not
be able to transfer pg_subscripion_rel entries of sequences, but it seemed to work
well because sequences are handled mostly same as normal tables on the pg_dump layer.
But based on other codes, the function should be "getSubscriptionRelations" and
comments should be also updated.

09. logical-replication.sgml

I feel we can add descriptions in "Publication" section. E.g.:

Publications can also include sequences, but the behavior differs from a table
or a group of tables: users can synchronize current states of sequences at an
arbitrary timing. For more details, please see "Replicating Sequences".

10. pg_proc.dat

```
+{ oid => '6313',
+ descr => 'current on-disk sequence state',
+ proname => 'pg_sequence_state', provolatile => 'v',
+ prorettype => 'record', proargtypes => 'regclass',
+ proallargtypes => '{regclass,pg_lsn,int8,int8,bool}',
+ proargmodes => '{i,o,o,o,o}',
+ proargnames => '{seq_oid,page_lsn,last_value,log_cnt,is_called}',
+ prosrc => 'pg_sequence_state' },
```

I think this is not wrong, but according to the src/include/catalog/unused_oids,
the oid should be at 8000-9999 while developing:

```
$ ./unused_oids
...
Patches should use a more-or-less consecutive range of OIDs.
Best practice is to start with a random choice in the range 8000-9999.
Suggested random unused OID: 9293 (415 consecutive OID(s) available starting here)
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2024-12-27 10:16:46 Re: Convert macros to static inline functions
Previous Message RECHTÉ Marc 2024-12-27 09:30:30 Re: Logical replication timeout