Re: In-place upgrade with streaming replicas

From: richard(at)kojedz(dot)in
To: Jerry Sievers <gsievers19(at)comcast(dot)net>
Cc: Álvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, pgsql-admin(at)lists(dot)postgresql(dot)org
Subject: Re: In-place upgrade with streaming replicas
Date: 2025-02-20 07:55:31
Message-ID: 71ed34f8c1ed858262a7cf50b633682f@kojedz.in
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin pgsql-general

Dear Jerry,

So, yes it turns out that some kind of loop must be involved here, as
you described:

1. ensure cluster is running
2. stop primary
3. wait some time
4. stop replicas
5. check if checkpoint locations match. repeat from step 1 if
out-of-sync.

My question here is, the unreliable step here is 3rd one. Can we query
the replica runtime if he did catch up? I mean, that after stopping the
primary, we can obtain the checkpoint location from pg_controldata,
then, can we somehow query the running replica about that?

Thanks in advance,
Richard

2025-02-20 08:49 időpontban Jerry Sievers ezt írta:
> richard(at)kojedz(dot)in writes:
>
>> Dear Alvaro,
>>
>> Thanks for your answers. Unfortunately, I was unaware of a shutdown
>> record, that makes a difference then. So, I definitely must stop the
>> primary first, then use pg_controldata to obtain checkpoint
>> info. Then, can I query the replicas while they are up and running if
>> they've received the shutdown record or not? So, after shutting down
>> the primary, how will I know if a replica has received the mentioned
>> record, and is safe to shutdown?
>
>
>
> Hmmm, not sure about that but what we do, is stop primary, wait a
> $short time, then stop replicas...
>
> Then run pg_controldata on all nodes | filter out only the line
> indicating latest checkpoint and sort -u the output. Expect only a
> single line if all are matched.
>
> You may also wish to first insure that you got the same number of
> lines as total node count before doing the sorting and uniqueing.
>
> Very rarely on our huge systems, we'd have a mismatch after the
> verification in in those cases, our automated upgrade procedure
> restarts all nodes and then does the shutdown and verify check again.
>
> HTH
>
>
>>
>> Thanks for the clarifications.
>>
>> Best regards,
>> Richard
>>
>> 2025-02-19 16:54 időpontban Álvaro Herrera ezt írta:
>>> On 2025-Feb-19, richard(at)kojedz(dot)in wrote:
>>>
>>>> With this, I have the question, that after the shutdown of primary,
>>>> what is
>>>> the guarantee for replicas having the same checkpoint location? Why
>>>> does the
>>>> order of shutting down the servers matter? What would be the really
>>>> exact
>>>> and reliable way to ensure that replicas will have the same
>>>> checkpoint
>>>> location as the primary?
>>> The replicas can't write WAL by themselves, but they will replay
>>> whatever the primary has sent; by shutting down the primary first and
>>> letting the replicas catch up, you ensure that the replicas will
>>> actually receive the shutdown record and replay it. If you shut down
>>> the replicas first, they can obviously never catch up with the
>>> shutdown
>>> checkpoint of the primary.
>>> As I recall, if you do shut down the primary first, one potential
>>> danger
>>> is that the primary fails to send the checkpoint record before
>>> shutting
>>> down, so the replicas won't receive it and obviously will not replay
>>> it;
>>> or simply that they are behind enough that they receive it but don't
>>> replay it.
>>> You could use pg_controldata to read the last checkpoint info from
>>> all
>>> nodes. You can run it on the primary after shutting it down, and
>>> then
>>> on each replica while it's still running to ensure that the correct
>>> restartpoint has been created.

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Dietrich, Benjamin 2025-02-20 15:00:35 May data be corrupted after an interrupted, but afterwards sucessfully replayed recovery?
Previous Message Jerry Sievers 2025-02-20 07:49:03 Re: In-place upgrade with streaming replicas

Browse pgsql-general by date

  From Date Subject
Next Message Tomas Vondra 2025-02-20 09:27:17 Re: Logical decoding
Previous Message Jerry Sievers 2025-02-20 07:49:03 Re: In-place upgrade with streaming replicas