Re: PostGreSQL Replication and question on maintenance

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: github kran <githubkran(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>, pgsql-sql Owner <pgsql-sql-owner(at)lists(dot)postgresql(dot)org>, postgres list <pgsql-sql(at)postgresql(dot)org>, Michael Lewis <mlewis(at)entrata(dot)com>
Subject: Re: PostGreSQL Replication and question on maintenance
Date: 2019-11-15 05:41:35
Message-ID: CAFj8pRB8-ycgcawupU1QEyCWSS2RJ=4O41QYU0YVoiM0Zp9KDA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-sql

pá 15. 11. 2019 v 6:26 odesílatel github kran <githubkran(at)gmail(dot)com> napsal:

>
> Hello postGreSQL Community ,
>>
>>
>>
>> Hope everyone is doing great !!.
>>
>>
>> *Background*
>>
>> We use PostgreSQL Version 10.6 version and heavily use PostgreSQL for our
>> day to day activities to write and read data. We have 2 clusters running
>> PostgreSQL engine , one cluster
>>
>> keeps data up to 60 days and another cluster retains data beyond 1 year.
>> The data is partitioned close to a week( ~evry 5 days a partition) and we
>> have around 5 partitions per month per each table and we have 2 tables
>> primarily so that will be 10 tables a week. So in the cluster-1 we have
>> around 20 partitions and in cluster-2 we have around 160 partitions ( data
>> from 2018). We also want to keep the data for up to 2 years in the
>> cluster-2 to serve the data needs of the customer and so far we reached
>> upto 1 year of maintaining this data.
>>
>>
>>
>> *Current activity*
>>
>> We have a custom weekly migration DB script job that moves data from 1
>> cluster to another cluster what it does is the below things.
>>
>> 1) COPY command to copy the data from cluster-1 and split that data into
>> binary files
>>
>> 2) Writing the binary data into the cluster-2 table
>>
>> 3) Creating indexes after the data is copied.
>>
>>
>>
>> *Problem what we have right now. *
>>
>> When the migration activity runs(weekly) from past 2 times , we saw the
>> cluster read replica instance has restarted as it fallen behind the
>> master(writer instance). Everything
>>
>> after that worked seamlessly but we want to avoid the replica getting
>> restarted. To avoid from restart we started doing smaller binary files and
>> copy those files to the cluster-2
>>
>> instead of writing 1 big file of 450 million records. We were successful
>> in the recent migration as the reader instance didn’t restart after we
>> split 1 big file into multiple files to copy the data over but did restart
>> after the indexes are created on the new table as it could be write
>> intensive.
>>
>>
>>
>> *DB parameters set on migration job*
>>
>> work_mem set to 8 GB and maintenace_work_mem=32 GB.
>>
>
these numbers looks crazy high - how much memory has your server - more
than 1TB?

Indexes per table = 3
>>
>> total indexes for 2 tables = 5
>>
>>
>>
>> *DB size*
>>
>> Cluster-2 = 8.6 TB
>>
>> Cluster-1 = 3.6 TB
>>
>> Peak Table relational rows = 400 - 480 million rows
>>
>> Average table relational rows = 300 - 350 million rows.
>>
>> Per table size = 90 -95 GB , per table index size is about 45 GB
>>
>>
>>
>> *Questions*
>>
>> 1) Can we decrease the maintenace_work_mem to 16 GB and will it slow down
>> the writes to the cluster , with that the reader instance can sync the data
>> slowly ?.
>>
>> 2) Based on the above use case what are your recommendations to keep the
>> data longer up to 2 years ?
>>
>> 3) What other recommendations you recommend ?.
>>
>>
>>
>>
>>
>> Appreciate your replies.
>>
>> THanks
>> githubkran
>>
>>>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Arup Rakshit 2019-11-15 10:01:26 Create array of data from JSONB in PG 9.5
Previous Message github kran 2019-11-15 05:26:20 Fwd: PostGreSQL Replication and question on maintenance

Browse pgsql-sql by date

  From Date Subject
Next Message github kran 2019-11-15 15:10:54 Re: PostGreSQL Replication and question on maintenance
Previous Message github kran 2019-11-15 05:26:20 Fwd: PostGreSQL Replication and question on maintenance