From: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
---|---|
To: | Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Subject: | Re: pg_receivewal starting position |
Date: | 2021-09-03 15:49:34 |
Message-ID: | CALj2ACVCPPSL99oKG_F8k7g8NkWytw=N6HaYKz--p1U6bCxPyg@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, Sep 3, 2021 at 3:28 PM Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> wrote:
> There is still the concern raised by Bharath about being able to select the
> way to fetch the replication slot information for the user, and what should or
> should not be included in the documentation. I am in favor of documenting the
> process of selecting the wal start, and maybe include a --start-lsn option for
> the user to override it, but that's maybe for another patch.
Let's hear from others.
Thanks for the patches. I have some quick comments on the v5 patch-set:
0001:
1) Do you also want to MemSet values too in ReadReplicationSlot?
2) When if clause has single statement we don't generally use "{" and "}"
+ if (slot == NULL || !slot->in_use)
+ {
+ LWLockRelease(ReplicationSlotControlLock);
+ }
you can just have:
+ if (slot == NULL || !slot->in_use)
+ LWLockRelease(ReplicationSlotControlLock);
3) This is still not copying the slot contents, as I said earlier you
can just take the required info into some local variables instead of
taking the slot pointer to slot_contents pointer.
+ /* Copy slot contents while holding spinlock */
+ SpinLockAcquire(&slot->mutex);
+ slot_contents = *slot;
+ SpinLockRelease(&slot->mutex);
+ LWLockRelease(ReplicationSlotControlLock);
4) As I said earlier, why can't we have separate tli variables for
more readability instead of just one slots_position_timeline and
timeline_history variable? And you are not even resetting those 2
variables after if
(!XLogRecPtrIsInvalid(slot_contents.data.restart_lsn)), you might end
up sending the restart_lsn timelineid for confirmed_flush. So, better
use two separate variables. In fact you can use block local variables:
+ if (!XLogRecPtrIsInvalid(slot_contents.data.restart_lsn))
+ {
+ List *tl_history= NIL;
+ TimeLineID tli;
+ tl_history= readTimeLineHistory(ThisTimeLineID);
+ tli = tliOfPointInHistory(slot_contents.data.restart_lsn,
+ tl_history);
+ values[i] = Int32GetDatum(tli);
+ nulls[i] = false;
+ }
similarly for confirmed_flush.
5) I still don't see the need for below test case:
+ "READ_REPLICATION_SLOT does not produce an error with non existent slot");
when we have
+ "READ_REPLICATION_SLOT does not produce an error with dropped slot");
Because for a user, dropped or non existent slot is same, it's just
that for dropped slot we internally don't delete its entry from the
shared memory.
0002:
1) Imagine GetSlotInformation always returns
READ_REPLICATION_SLOT_ERROR, don't you think StreamLog enters an
infinite loop there? Instead, why don't you just exit(1); instead of
return; and retry? Similarly for READ_REPLICATION_SLOT_NONEXISTENT? I
think, you can just do exit(1), no need to retry.
+ case READ_REPLICATION_SLOT_ERROR:
+
+ /*
+ * Error has been logged by GetSlotInformation, return and
+ * maybe retry
+ */
+ return;
2) Why is it returning READ_REPLICATION_SLOT_OK when slot_info isn't
passed? And why are you having this check after you connect to the
server and fetch the data?
+ /* If no slotinformation has been passed, we can return immediately */
+ if (slot_info == NULL)
+ {
+ PQclear(res);
+ return READ_REPLICATION_SLOT_OK;
+ }
Instead you can just have a single assert:
+ Assert(slot_name && slot_info );
3) How about
pg_log_error("could not read replication slot:
instead of
pg_log_error("could not fetch replication slot:
4) Why are you having the READ_REPLICATION_SLOT_OK case in default?
+ default:
+ if (slot_info.is_logical)
+ {
+ /*
+ * If the slot is not physical we can't expect to
+ * recover from that
+ */
+ pg_log_error("cannot use the slot provided, physical slot expected.");
+ exit(1);
+ }
+ stream.startpos = slot_info.restart_lsn;
+ stream.timeline = slot_info.restart_tli;
+ }
You can just have another case statement for READ_REPLICATION_SLOT_OK
and in the default you can throw an error "unknown read replication
slot status" or some other better working and exit(1);
5) Do you want initialize slot_info to 0?
+ if (replication_slot)
+ {
+ SlotInformation slot_info;
+ MemSet(slot_info, 0, sizeof(SlotInformation));
6) This isn't readable:
+ slot_info->is_logical = strcmp(PQgetvalue(res, 0, 0), "logical") == 0;
How about:
if (strcmp(PQgetvalue(res, 0, 0), "logical") == 0)
slot_info->is_logical = true;
You don't need to set it false, because you would have
MemSet(slot_info) in the caller.
7) How about
uint32 hi;
uint32 lo;
instead of
+ uint32 hi,
+ lo;
8) Move SlotInformation * slot_info) to the next line as it crosses
the 80 char limit.
+GetSlotInformation(PGconn *conn, const char *slot_name,
SlotInformation * slot_info)
9) Instead of a boolean is_logical, I would rather suggest to use an
enum or #define macros the slot types correctly, because someday we
might introduce new type slots and somebody wants is_physical kind of
variable name?
+typedef struct SlotInformation {
+ bool is_logical;
Regards,
Bharath Rupireddy.
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Gustafsson | 2021-09-03 16:24:04 | Re: A reloption for partitioned tables - parallel_workers |
Previous Message | Bharath Rupireddy | 2021-09-03 15:03:45 | Re: Trap errors from streaming child in pg_basebackup to exit early |