From: | Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> |
Subject: | Re: improve pg_receivewal code |
Date: | 2021-09-02 15:34:59 |
Message-ID: | 5154763.rjdIOjcI6e@aivenronan |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Le lundi 30 août 2021, 09:32:05 CEST Bharath Rupireddy a écrit :
> Hi,
>
> I see there's a scope to do following improvements to pg_receivewal.c:
Thank you Bharath for this patch.
>
> 1) Fetch the server system identifier in the first RunIdentifySystem
> call and use it to identify(via pg_receivewal's ReceiveXlogStream) any
> unexpected changes that may happen in the server while pg_receivewal
> is connected to it. This can be helpful in scenarios when
> pg_receivewal tries to reconnect to the server (see the code around
> pg_usleep with RECONNECT_SLEEP_TIME) but something unexpected has
> happnend in the server that changed the its system identifier. Once
> the pg_receivewal establishes the conenction to server again, then the
> ReceiveXlogStream has a code chunk to compare the system identifier
> that we received in the initial connection.
I'm not sure what kind of problem could be happening here: if you were
somewhat routed to another server ? Or if we "switched" the cluster listening
on that port ?
> 2) Move the RunIdentifySystem to identify timeline id and start LSN
> from the server only if the pg_receivewal failed to get them from
> FindStreamingStart. This way, an extra IDENTIFY_SYSTEM command is
> avoided.
That makes sense, even if we add another IDENTIFY_SYSTEM to check against the
one set in the first place.
> 3) Place the "replication connetion shouldn't have any database name
> associated" error check right after RunIdentifySystem so that we can
> avoid fetching wal segment size with RetrieveWalSegSize if at all we
> were to fail with that error. This change is similar to what
> pg_recvlogical.c does.
Makes sense.
> 4) Move the RetrieveWalSegSize to just before pg_receivewal.c enters
> main loop to get the wal from the server. This avoids an unnecessary
> query for pg_receivewal with "--create-slot" or "--drop-slot".
> 5) Have an assertion after the pg_receivewal done a good amount of
> work to find start timeline and LSN might be helpful:
> Assert(stream.timeline != 0 && stream.startpos != InvalidXLogRecPtr);
>
> Attaching a patch that does take care of above improvements. Thoughts?
Overall I think it is good.
However, you have some formatting issues, here it mixes tabs and spaces:
+ /*
+ * No valid data can be found in the existing output
directory.
+ * Get start LSN position and current timeline ID from
the server.
+ */
And here it is not formatted properly:
+static char *server_sysid = NULL;
>
> Regards,
> Bharath Rupireddy.
--
Ronan Dunklau
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2021-09-02 15:47:55 | Re: [Patch] ALTER SYSTEM READ ONLY |
Previous Message | Robert Haas | 2021-09-02 15:30:59 | Re: using an end-of-recovery record in all cases |