| From: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> | 
|---|---|
| To: | Euler Taveira <euler(at)eulerto(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org | 
| Subject: | Re: speed up a logical replica setup | 
| Date: | 2022-03-15 13:51:11 | 
| Message-ID: | b9aa614c-84ba-a869-582f-8d5e3ab57424@enterprisedb.com | 
| Views: | Whole Thread | Raw Message | Download mbox | Resend email | 
| Thread: | |
| Lists: | pgsql-hackers | 
On 21.02.22 13:09, Euler Taveira wrote:
> A new tool called pg_subscriber does this conversion and is tightly 
> integrated
> with Postgres.
Are we comfortable with the name pg_subscriber?  It seems too general.
Are we planning other subscriber-related operations in the future?  If
so, we should at least make this one use a --create option or
something like that.
doc/src/sgml/ref/pg_subscriber.sgml
Attached is a patch that reorganizes the man page a bit.  I moved the
description of the steps to the Notes section and formatted it
differently.  I think the steps are interesting but not essential for
the using of the program, so I wanted to get them out of the main
description.
src/bin/pg_subscriber/pg_subscriber.c
+   if (made_temp_replslot)
+   {
+       conn = connect_database(dbinfo[0].pubconninfo, true);
+       drop_replication_slot(conn, &dbinfo[0], temp_replslot);
+       disconnect_database(conn);
+   }
Temp slots don't need to be cleaned up.
+/*
+ * Obtain the system identifier from control file. It will be used to 
compare
+ * if a data directory is a clone of another one. This routine is used 
locally
+ * and avoids a replication connection.
+ */
+static char *
+get_control_from_datadir(const char *datadir)
This could return uint64 directly, without string conversion.
get_sysid_from_conn() could then convert to uint64 internally.
+ {"verbose", no_argument, NULL, 'v'},
I'm not sure if the --verbose option is all that useful.
+ {"stop-subscriber", no_argument, NULL, 1},
This option doesn't seem to be otherwise supported or documented.
+   pub_sysid = pg_malloc(32);
+   pub_sysid = get_sysid_from_conn(dbinfo[0].pubconninfo);
+   sub_sysid = pg_malloc(32);
+   sub_sysid = get_control_from_datadir(subscriber_dir);
These mallocs don't appears to be of any use.
+ dbname_conninfo = pg_malloc(NAMEDATALEN);
This seems wrong.
Overall, this code could use a little bit more structuring.  There are
a lot of helper functions that don't seem to do a lot and are mostly
duplicate runs-this-SQL-command calls.  But the main() function is
still huge.  There is room for refinement.
src/bin/pg_subscriber/t/001_basic.pl
Good start, but obviously, we'll need some real test cases here also.
  src/bin/initdb/initdb.c
  src/bin/pg_ctl/pg_ctl.c
  src/common/file_utils.c
  src/include/common/file_utils.h
I recommend skipping this refactoring.  The readfile() function from
pg_ctl is not general enough to warrant the pride of place of a
globally available function.  Note that it is specifically geared
toward some of pg_ctl's requirements, for example that the underlying
file can change while it is being read.
The requirements of pg_subscriber can be satisfied more easily: Just
call pg_ctl to start the server.  You are already using that in
pg_subscriber.  Is there a reason it can't be used here as well?
| Attachment | Content-Type | Size | 
|---|---|---|
| 0001-fixup-Create-a-new-logical-replica-from-a-base-backu.patch | text/plain | 8.4 KB | 
| From | Date | Subject | |
|---|---|---|---|
| Next Message | osumi.takamichi@fujitsu.com | 2022-03-15 14:00:52 | RE: Skipping logical replication transactions on subscriber side | 
| Previous Message | Dong Wook Lee | 2022-03-15 13:50:25 | Re: Add pg_freespacemap extension sql test |