From: | "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
---|---|
To: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, 'Shubham Khanna' <khannashubham1197(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com> |
Subject: | RE: speed up a logical replica setup |
Date: | 2024-01-25 09:05:14 |
Message-ID: | TY3PR01MB9889FEDBBF74A9F79CA7CC87F57A2@TY3PR01MB9889.jpnprd01.prod.outlook.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Dear hackers,
Here are comments for v8 patch set. I may revise them by myself,
but I want to post here to share all of them.
01.
```
/* Options */
static char *pub_conninfo_str = NULL;
static SimpleStringList database_names = {NULL, NULL};
static int wait_seconds = DEFAULT_WAIT;
static bool retain = false;
static bool dry_run = false;
```
Just to confirm - is there a policy why we store the specified options? If you
want to store as global ones, username and port should follow (my fault...).
Or, should we have a structure to store them?
02.
```
{"subscriber-conninfo", required_argument, NULL, 'S'},
```
This is my fault, but "--subscriber-conninfo" is still remained. It should be
removed if it is not really needed.
03.
```
{"username", required_argument, NULL, 'u'},
```
Should we accept 'U' instead of 'u'?
04.
```
{"dry-run", no_argument, NULL, 'n'},
```
I'm not sure why the dry_run mode exists. In terms pg_resetwal, it shows the
which value would be changed based on the input. As for the pg_upgrade, it checks
whether the node can be upgraded for now. I think, we should have the checking
feature, so it should be renamed to --check. Also, the process should exit earlier
at that time.
05.
I felt we should accept some settings from enviroment variables, like pg_upgrade.
Currently, below items should be acceted.
- data directory
- username
- port
- timeout
06.
```
pg_logging_set_level(PG_LOG_WARNING);
```
If the default log level is warning, there are no ways to output debug logs.
(-v option only raises one, so INFO would be output)
I think it should be PG_LOG_INFO.
07.
Can we combine verifications into two functions, e.g., check_primary() and check_standby/check_subscriber()?
08.
Not sure, but if we want to record outputs by pg_subscriber, the sub-directory
should be created. The name should contain the timestamp.
09.
Not sure, but should we check max_slot_wal_keep_size of primary server? It can
avoid to fail starting of logical replicaiton.
10.
```
nslots_new = nslots_old + dbarr.ndbs;
if (nslots_new > max_replication_slots)
{
pg_log_error("max_replication_slots (%d) must be greater than or equal to "
"the number of replication slots required (%d)", max_replication_slots, nslots_new);
exit(1);
}
```
I think standby server must not have replication slots. Because subsequent
pg_resetwal command discards all the WAL file, so WAL records pointed by them
are removed. Currently pg_resetwal does not raise ERROR at that time.
11.
```
/*
* Stop the subscriber if it is a standby server. Before executing the
* transformation steps, make sure the subscriber is not running because
* one of the steps is to modify some recovery parameters that require a
* restart.
*/
if (stat(pidfile, &statbuf) == 0)
```
I kept just in case, but I'm not sure it is still needed. How do you think?
Removing it can reduce an inclusion of pidfile.h.
12.
```
pg_ctl_cmd = psprintf("\"%s/pg_ctl\" stop -D \"%s\" -s",
standby.bindir, standby.pgdata);
rc = system(pg_ctl_cmd);
pg_ctl_status(pg_ctl_cmd, rc, 0);
```
There are two places to stop the instance. Can you divide it into a function?
13.
```
* A temporary replication slot is not used here to avoid keeping a
* replication connection open (depending when base backup was taken, the
* connection should be open for a few hours).
*/
conn = connect_database(primary.base_conninfo, dbarr.perdb[0].dbname);
if (conn == NULL)
exit(1);
consistent_lsn = create_logical_replication_slot(conn, true,
&dbarr.perdb[0]);
```
I didn't notice the comment, but still not sure the reason. Why we must reserve
the slot until pg_subscriber finishes? IIUC, the slot would be never used, it
is created only for getting a consistent_lsn. So we do not have to keep.
Also, just before, logical replication slots for each databases are created, so
WAL records are surely reserved.
14.
```
pg_log_info("starting the subscriber");
start_standby_server(&standby, subport, server_start_log);
```
This info should be in the function.
15.
```
/*
* Create a subscription for each database.
*/
for (i = 0; i < dbarr.ndbs; i++)
```
This can be divided into a function, like create_all_subscriptions().
16.
My fault: usage() must be updated.
17. use_primary_slot_name
```
if (PQntuples(res) != 1)
{
pg_log_error("could not obtain replication slot information: got %d rows, expected %d row",
PQntuples(res), 1);
return NULL;
}
```
Error message should be changed. I think this error means the standby has wrong primary_slot_name, right?
18. misc
Sometimes the pid of pg_subscriber is referred. It can be stored as global variable.
19.
C99-style has been allowed, so loop variables like "i" can be declared in the for-statement, like
```
for (int i = 0; i < MAX; i++)
```
20.
Some comments, docs, and outputs must be fixed when the name is changed.
Best Regards,
Hayato Kuroda
FUJITSU LIMITED
From | Date | Subject | |
---|---|---|---|
Next Message | Bharath Rupireddy | 2024-01-25 09:05:30 | Re: Improve WALRead() to suck data directly from WAL buffers when possible |
Previous Message | Sutou Kouhei | 2024-01-25 08:52:55 | Re: Make COPY format extendable: Extract COPY TO format implementations |