Re: PG Upgrade with hardlinks, when to start/stop master and replicas

From: Hellmuth Vargas <hivs77(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Martín Fernández <fmartin91(at)gmail(dot)com>, "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: PG Upgrade with hardlinks, when to start/stop master and replicas
Date: 2019-02-19 14:19:25
Message-ID: CAN3Qy4qh3f_pugLzM3c91tAi5xv-XMiTjYTYPYSgVXHqjrjiFw@mail.gmail.com
Lists: pgsql-general

Hi,

But could you follow this procedure instead?

pg_upgrade the master
rsync to a hot standby
start the master
start the hot standby
stop the hot standby and rsync the other hot standby from the migrated hot
standby
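
To make the ordering question concrete, here is a minimal Python sketch that only models the sequence of steps; it runs no PostgreSQL commands. The step labels, the standby names and the function name are made up for illustration. It lists every start that happens before the last rsync, which is what the "Do not start any servers yet" note in the pg_upgrade manual (quoted below) asks you to avoid.

```python
def starts_before_last_rsync(plan):
    """Return the 'start' steps that occur before the final rsync step."""
    rsync_positions = [i for i, step in enumerate(plan) if step.startswith("rsync")]
    if not rsync_positions:
        # No rsync at all: every start happens without the standbys being synced.
        return [step for step in plan if step.startswith("start")]
    last_rsync = rsync_positions[-1]
    return [step for step in plan[:last_rsync] if step.startswith("start")]


# The procedure asked about above (labels are illustrative only).
proposed = [
    "pg_upgrade primary",
    "rsync primary -> standby1",
    "start primary",
    "start standby1",
    "stop standby1",
    "rsync standby1 -> standby2",
]

# The order implied by the manual: all rsyncs first, then start servers.
documented = [
    "pg_upgrade primary",
    "rsync primary -> standby1",
    "rsync primary -> standby2",
    "start primary",
    "start standby1",
    "start standby2",
]

early_starts_proposed = starts_before_last_rsync(proposed)
early_starts_documented = starts_before_last_rsync(documented)
```

In this model the proposed plan starts two servers before its last rsync, while the documented order starts none. The sketch only shows where the plan departs from the manual; it says nothing by itself about whether that departure is safe.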

On Tue, Feb 19, 2019 at 06:12, Stephen Frost (sfrost(at)snowman(dot)net)
wrote:

> Greetings,
>
> * Martín Fernández (fmartin91(at)gmail(dot)com) wrote:
> > After reading the pg_upgrade documentation multiple times, it seems that
> > after running pg_upgrade on the primary instance, we can't start it until
> > we run rsync from the primary to the standby. I'm understanding this from
> > the following section in the pg_upgrade manual page.
> >
> > ```
> > You will not be running pg_upgrade on the standby servers, but rather
> > rsync on the primary. Do not start any servers yet.
> > ```
> >
> > I'm understanding the `any` as primary and standbys.
>
> Yes, that's correct, you shouldn't start up anything yet.
>
> > On the other hand, we've been doing tests that start
> > the primary instance as soon as pg_upgrade is done. These tests have worked
> > perfectly fine so far. We make the rsync call with the primary instance
> > running and the standby can start later on after rsync is done and we copy
> > the new configuration files.
>
> This is like taking an online backup of the primary without actually
> doing pg_start_backup / pg_stop_backup and following the protocol for
> that, meaning that the replica will start up without a backup_label and
> will think it's at whatever point in the WAL stream that the pg_control
> file says it's at as of whenever the rsync copies that file.
>
> That is NOT SAFE and it's a sure way to end up with corruption.
>
> The rsync while everything is down should be pretty fast, unless you
> have unlogged tables that are big (in which case, you should truncate
> them before shutting down the primary) or temporary tables left around
> (which you should clean up) or just generally other things that a
> replica doesn't normally have.
>
> If you can't have any downtime during this process then, imv, the answer
> is to build out a new replica that will essentially be a 'throw-away',
> move all the read load over to it and then go through the documented
> pg_upgrade process with the primary and the other replicas, then flip
> the traffic back to the primary + original replicas and then you can
> either throw away the replica that was kept online or rebuild it using
> the traditional methods of pg_basebackup (or for a larger system, you
> could use pgbackrest which can run in parallel and is much, much faster
> than pg_basebackup).
>
> > If what we are doing is wrong and we need to run `rsync` before starting
> > the primary instance, that would mean that the primary and the standby are
> > not usable if pg10 doesn't start correctly on the primary, right?
>
> This is another reason why it's good to have an independent replica, as
> it can be a fail-safe if things go completely south (you can just
> promote it and have it be the primary and then rebuild replicas using
> the regular backup+restore method and figure out what went wrong with
> the pg10 migration).
>
> Thanks!
>
> Stephen
>

--
Regards,

Ing. Hellmuth I. Vargas S.
Specialist in Telematics and Internet Business
Oracle Database 10g Administrator Certified Associate
EnterpriseDB Certified PostgreSQL 9.3 Associate
