Re: Fast logical replication jump start with PG 10

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Olivier Gautherot <olivier(at)gautherot(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: Fast logical replication jump start with PG 10
Date: 2018-05-26 17:27:51
Message-ID: 67e76903-b1fb-5a43-4b83-4f721ae2e1b0@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 05/26/2018 06:23 AM, Olivier Gautherot wrote:
> Hi Adrian!
>
> On Sat, May 26, 2018 at 12:11 AM, Adrian Klaver
> <adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>> wrote:
>
> On 05/25/2018 06:35 PM, Olivier Gautherot wrote:
>
> Hi Adrian, thanks for your reply. Here is the clarification.
>
> 1) It is indeed a pg_upgrade from 9.2 to 10.4. Depending on the
> test machine, it runs in between 15 and 20 minutes for just over
> 100GB. I can negotiate this time with our customer. The vacuum
> process took another 5 to 7 minutes. This this what I was
> referring to with the 30 minutes (point 3 in your questions)
>
> 2) After pg_upgrade, I published the tables on the database (in
> the sense "CREATE DATABASE") and subscribed to this publication
> on the second server (logical replication). The data copy
> processed started immediately and took around 1 hour. I then
> loaded the indexes, what took > another 2h20m. At that point the
> active-passive cluster was ready to go.
>
>
> The index creation was done on the replicated machine I presume,
> using what command?
>
>
> The sequence on the replicated machine was (pseudo-code to simplify the
> syntax):
> - pg_dump --section=pre-data -h master_machine master_database | psql -h
> replication_machine replication_database
> # This took seconds, "pre-data" discards the indexes
>
> - psql -h replication_machine -c "CREATE SUBSCRIPTION mysub CONNECTION
> "..." PUBLICATION mypub;" replication_database
> # This took about 1 hour for the initial sync
>
> - pg_dump --section=post-data -h master_machine master_database | psql
> -h replication_machine replication_database
> # This took 2h20m to load the various indexes
>
> This sequence follows the recommendation of section 14.4.3 in
> https://www.postgresql.org/docs/10/static/populate.html . If I stick to
> streaming as we do today (e.g. pg_upgrade and then rsync to the
> replication server), I can be ready in about 1 hour (more acceptable for
> the customer).

I am still learning what logical replication is capable of so take the
following with that in mind.

1) I used
pg_basebackup(www.postgresql.org/docs/10/static/app-pgbasebackup.html)
to create a new $DATA directory for a replica instance.

2) I configured the master and the replica for logical replication. Also
changed the copied over conf files to work for the new instance e.g.
changed the port number.

3) I set up the PUBLICATION:

CREATE PUBLICATION everything FOR ALL TABLES;

4) I set up the SUBSCRIPTION:

CREATE SUBSCRIPTION test_sub CONNECTION 'dbname=production user=postgres
port=5432' PUBLICATION everything WITH(copy_data=false);

*NOTE* the copy_data=false.

5) Then I started entering data in the master and it was replicated.

Caveats:

1) This was a small database.

2) The master and replica where on the same machine.

3) There was no activity on the master between the pg_basebackup and the
CREATE PUBLICATION/CREATE SUBSCRIPTION commands.

>
> The reasons for the indexes to take so long is the large number of them
> on big tables (for instance, 7 indexes on a partitioned table, with 3
> partitions of 15GB of data in 30M rows). I will skip the reasons that
> got us there (please no flames, I'm aware of the issue :-) ). I don't
> have definite execution times for the Production environment (in a
> datacenter), which tends to be kind of a lottery in terms of execution
> times compared to testing (on a desktop in the office).
>
>
> Note that the active and the passive databases are on different
> machines.
>
> 4) By "database" I mean the result of "CREATE DATABASE" and we
> have 1 per server (or "cluster" in your terminology - I tend to
> use this word for a group of machines). We are currently using a
> streaming replication
>
>
> Yeah I understand, it is just that database and cluster have
> specific meanings in Postgres and it helps to stick to those
> meanings when discussing replication operations. Lowers the
> confusion level:)
>
> between the 9.2 servers, so it could be a fall-back option after
> the upgrade (I wanted to remove part of the indexes on the
> master to lower the load, reason to use the logical
> replication... if the execution time is not too excessive).
>
>
> So the time you showed was with those indexes removed or not?
>
>
> I did try to synchronize the database with the indexes installed and
> eventually dropped the replication database after a full week-end of
> hectic activity (apparently, the initial sync job was not finished...).
> I will try it again just to make sure but I'm fairly positive that I
> will get to the same result.
>
>
>
> Hope it clarifies the question
> Best regards
> Olivier
>
>
> Olivier Gautherot
> olivier(at)gautherot(dot)net <mailto:olivier(at)gautherot(dot)net>
> <mailto:olivier(at)gautherot(dot)net <mailto:olivier(at)gautherot(dot)net>>
> Cel:+56 98 730 9361
> Skype: ogautherot
> www.gautherot.net <http://www.gautherot.net>
> <http://www.gautherot.net>
> http://www.linkedin.com/in/ogautherot
> <http://www.linkedin.com/in/ogautherot>
>
>
> On Fri, May 25, 2018 at 7:51 PM, Adrian Klaver
> <adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
> <mailto:adrian(dot)klaver(at)aklaver(dot)com
> <mailto:adrian(dot)klaver(at)aklaver(dot)com>>> wrote:
>
>     On 05/25/2018 02:12 PM, Olivier Gautherot wrote:
>
>         Hi,
>
>         I just sent the question on StackOverflow but realized
> that this
>         audience may be more savvy. So sorry in advance for
> cross-posting...
>
>         I'm in the process of upgrading a PG from 9.2 to 10.4.
>         pg_upgrade worked fine on the master and was rather
> fast. The
>         problem is that the database is replicated and I'm
> planning to
>         switch from streaming to logical. The problem is that it is
>         rather slow (30 minutes for the master and over 3 hours
> for the
>         replication, between data transfer and indexes).
>
>
>     I am not clear on what you did, so can you clarify the
> following:
>
>     1) pg_upgrade from 9.2 master instance to 10.4 master
> instance, correct?
>
>     2) What replication are you talking about for the 3 hour value?
>
>     3) What is the 30 minute value referring to?
>
>     4) When you say database are you talking about a Postgres
> cluster or
>     a database in the cluster?
>
>         Is there a way to speed up the replication or should I
> rather
>         stick to streaming replication? As I have only 1
> database on the
>         server, it would not be a show-stopper.
>
>     See 4) above, but if you are talking about a single
> database in a
>     cluster streaming replication will not work for that.
>
>         Thanks in advance
>         Olivier Gautherot
> http://www.linkedin.com/in/ogautherot
> <http://www.linkedin.com/in/ogautherot>
>         <http://www.linkedin.com/in/ogautherot
> <http://www.linkedin.com/in/ogautherot>>
>
>     --     Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
> <mailto:adrian(dot)klaver(at)aklaver(dot)com
> <mailto:adrian(dot)klaver(at)aklaver(dot)com>>
>
>
>
>
> --
> Adrian Klaver
> adrian(dot)klaver(at)aklaver(dot)com <mailto:adrian(dot)klaver(at)aklaver(dot)com>
>
>
> Olivier

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2018-05-26 18:00:33 Re: Fast logical replication jump start with PG 10
Previous Message Adrian Klaver 2018-05-26 16:37:22 Re: UPDATE from CTE syntax error