Re: High replication lag - Stream Replication

From: Marcelo Kruger <marcelo(dot)kruger(at)neoway(dot)com(dot)br>
To: Shreeyansh Dba <shreeyansh2014(at)gmail(dot)com>
Cc: "pgsql-admin(at)postgresql(dot)org" <pgsql-admin(at)postgresql(dot)org>
Subject: Re: High replication lag - Stream Replication
Date: 2017-11-09 12:06:30
Message-ID: CAN8toQOaky5Hmv1rxKfzxApYYUGUFzc0Ooy09dAhff_krzmzyQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi Shreeyansh,

I changed the log_min_duration_statement parameter to -1, but there was no
improvement in the database. The application of the archives by Stream
Replication is still slow.

In my usage scenario we performed large table creation operations. We
create several tables in a short time with more than 40GB. We currently
have a 500GB growth per month in the database.

Transactions in one day can generate more than 30000 archives (480GB).

Are there any other parameters that can be changed to improve performance
in the application of archives?

2017-11-09 2:43 GMT-02:00 Shreeyansh Dba <shreeyansh2014(at)gmail(dot)com>:

> On Wed, Nov 8, 2017 at 9:34 PM, Marcelo Kruger <
> marcelo(dot)kruger(at)neoway(dot)com(dot)br> wrote:
>
>> Hi Shreeyansh,
>>
>> Thank you for your help.
>>
>> The lag is not occurring in the recording of the archive in production,
>> nor in sending to StandBy. The lag is occurring in the archive application
>> in StandBy. And this slowness increases with running queries in the StandBy
>> database.
>>
>> The process of recovering is consuming little CPU, and little I / O.
>>
>> postgres: startup process recovering 0000000100001B4300000006
>>
>> This parameter *log_min_duration_statement *should be changed on the
>> StandBy server?
>>
>
> The parameter should be changed On both master as well as slave as this
> result towards heavy load on the server causing slower execution of the
> queries due to higher resource consumption.
>
> This parameter normally set capturing bottleneck queries by setting the
> higher values to minimize the write activities and load on the servers.
>
>
>
>
>>
>> 2017-11-08 12:09 GMT-02:00 Shreeyansh Dba <shreeyansh2014(at)gmail(dot)com>:
>>
>>>
>>>
>>> On Wed, Nov 8, 2017 at 7:08 PM, Marcelo Kruger <
>>> marcelo(dot)kruger(at)neoway(dot)com(dot)br> wrote:
>>>
>>>> Good afternoon,
>>>>
>>>> I have two servers hosted on Azure with PostgreSQL 9.6 databases. One
>>>> database is production, the other is standby.
>>>> For replication I use stream replication between servers. However the
>>>> application of the archives in the standby database is slow, always
>>>> generating replication lag.
>>>>
>>>> The replication lag increases considerably when queries are performed
>>>> on the StandBy database. I wonder if there is any setting that can be made
>>>> to improve this performance.
>>>>
>>>> *Server Production*
>>>> SO: CentOS Linux release 7.4.1708 (Core) 64Bits
>>>> CPU: Intel(R) Xeon(R) CPU E5-2698B v3 @ 2.00GHz - 15 Cores
>>>> Memory: 240GB
>>>> I/O: 17TB RAID0 (17 Disks (1TB for disk) SSD with read and write cache
>>>> enabled)
>>>>
>>>> *Server StandBy*
>>>> SO: CentOS Linux release 7.4.1708 (Core) 64Bits
>>>> CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz - 20 Cores
>>>> Memory: 140GB
>>>> I/O: 22TB RAID0 (22 Disks (1TB for disk) SSD with read and write cache
>>>> enabled)
>>>>
>>>> *Configuration PostgreSQL Production*
>>>>
>>>> postgres.conf
>>>>
>>>> listen_addresses = '*'
>>>> port = 5433
>>>> max_connections = 2000
>>>> superuser_reserved_connections = 3
>>>> shared_buffers = 56GB
>>>> work_mem = 29360kB
>>>> maintenance_work_mem = 4GB
>>>> autovacuum_work_mem = 8GB
>>>> max_stack_depth = 5MB
>>>> dynamic_shared_memory_type = posix
>>>> bgwriter_delay = 10ms
>>>> bgwriter_lru_maxpages = 700
>>>> bgwriter_lru_multiplier = 2.0
>>>> fsync = off
>>>> synchronous_commit = on
>>>> full_page_writes = off
>>>> wal_buffers = 1500MB
>>>> wal_writer_delay = 1ms
>>>> checkpoint_timeout = 10min
>>>> max_wal_size = 4GB
>>>> min_wal_size = 2GB
>>>> checkpoint_completion_target = 0.9
>>>> effective_cache_size = 168GB
>>>> default_statistics_target = 500
>>>> log_destination = 'csvlog'
>>>> logging_collector = on
>>>> log_directory = 'pg_log'
>>>> log_filename = 'postgresql-%w.log'
>>>> log_file_mode = 0640
>>>> log_truncate_on_rotation = on
>>>> log_rotation_age = 1d
>>>> log_rotation_size = 600MB
>>>> log_min_duration_statement = 0
>>>> log_checkpoints = off
>>>> log_connections = off
>>>> log_disconnections = off
>>>> log_duration = off
>>>> log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h'
>>>> log_lock_waits = on
>>>> log_timezone = 'Brazil/East'
>>>> autovacuum = on
>>>> log_autovacuum_min_duration = -1
>>>> autovacuum_max_workers = 12
>>>> autovacuum_naptime = 30s
>>>> autovacuum_vacuum_threshold = 20
>>>> autovacuum_analyze_threshold = 20
>>>> autovacuum_vacuum_scale_factor = 0.2
>>>> autovacuum_analyze_scale_factor = 0.1
>>>> autovacuum_vacuum_cost_delay = 20
>>>> autovacuum_vacuum_cost_limit = 800
>>>> vacuum_cost_delay = 20
>>>> vacuum_cost_limit = 800
>>>> datestyle = 'iso, mdy'
>>>> timezone = 'Brazil/East'
>>>> lc_messages = 'en_US.UTF-8'
>>>> lc_monetary = 'en_US.UTF-8'
>>>> lc_numeric = 'en_US.UTF-8'
>>>> lc_time = 'en_US.UTF-8'
>>>> default_text_search_config = 'pg_catalog.english'
>>>> max_locks_per_transaction = 256
>>>> pgpool.pg_ctl = '/usr/pgsql-9.6/bin/pg_ctl'
>>>> effective_io_concurrency = 10
>>>> wal_level = hot_standby
>>>> max_wal_senders = 10
>>>> max_replication_slots = 3
>>>> wal_keep_segments = 38400
>>>>
>>>> *Configuration PostgreSQL StandBy*
>>>>
>>>> postgres.conf
>>>>
>>>> listen_addresses = '*'
>>>> port = 5433
>>>> max_connections = 2000
>>>> superuser_reserved_connections = 3
>>>> shared_buffers = 10GB
>>>> work_mem = 183500kB
>>>> maintenance_work_mem = 4GB
>>>> autovacuum_work_mem = 8GB
>>>> max_stack_depth = 5MB
>>>> dynamic_shared_memory_type = posix
>>>> bgwriter_delay = 10ms
>>>> bgwriter_lru_maxpages = 700
>>>> bgwriter_lru_multiplier = 2.0
>>>> fsync = off
>>>> synchronous_commit = on
>>>> full_page_writes = off
>>>> wal_buffers = 16MB
>>>> wal_writer_delay = 1ms
>>>> checkpoint_timeout = 10min
>>>> max_wal_size = 4GB
>>>> min_wal_size = 2GB
>>>> checkpoint_completion_target = 0.9
>>>> effective_cache_size = 104GB
>>>> default_statistics_target = 500
>>>> log_destination = 'csvlog'
>>>> logging_collector = on
>>>> log_directory = 'pg_log'
>>>> log_filename = 'postgresql-%w.log'
>>>> log_file_mode = 0640
>>>> log_truncate_on_rotation = on
>>>> log_rotation_age = 1d
>>>> log_rotation_size = 600MB
>>>> log_min_duration_statement = 0
>>>> log_checkpoints = off
>>>> log_connections = off
>>>> log_disconnections = off
>>>> log_duration = off
>>>> log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h'
>>>> log_lock_waits = on
>>>> log_timezone = 'Brazil/East'
>>>> autovacuum = on
>>>> log_autovacuum_min_duration = -1
>>>> autovacuum_max_workers = 3
>>>> autovacuum_naptime = 30s
>>>> autovacuum_vacuum_threshold = 20
>>>> autovacuum_analyze_threshold = 20
>>>> autovacuum_vacuum_scale_factor = 0.2
>>>> autovacuum_analyze_scale_factor = 0.1
>>>> autovacuum_vacuum_cost_delay = 20
>>>> autovacuum_vacuum_cost_limit = 800
>>>> vacuum_cost_delay = 20
>>>> vacuum_cost_limit = 800
>>>> datestyle = 'iso, mdy'
>>>> timezone = 'Brazil/East'
>>>> lc_messages = 'en_US.UTF-8'
>>>> lc_monetary = 'en_US.UTF-8'
>>>> lc_numeric = 'en_US.UTF-8'
>>>> lc_time = 'en_US.UTF-8'
>>>> default_text_search_config = 'pg_catalog.english'
>>>> max_locks_per_transaction = 256
>>>> pgpool.pg_ctl = '/usr/pgsql-9.6/bin/pg_ctl'
>>>> effective_io_concurrency = 10
>>>> max_worker_processes = 18
>>>> hot_standby = on
>>>> hot_standby_feedback = on
>>>>
>>>> recovery.conf
>>>>
>>>> standby_mode = 'on'
>>>> primary_slot_name = 'replicacao'
>>>> primary_conninfo = 'host=bdreplica00 port=5433 user=replicator
>>>> password='
>>>> trigger_file = '/var/lib/pgsql/9.6/data/master_db.conf'
>>>>
>>>> --
>>>>
>>>>
>>>> Marcelo Krüger
>>>> <https://br.linkedin.com/in/marcelo-kr%C3%BCger-40aab127>
>>>>
>>>> Data Integration
>>>>
>>>> Office: +55 48 3333-2030 <+55%2048%203333-2030> | Mobile: +55 48
>>>> 999679585 <+55%2048%2099967-9585>
>>>>
>>>> R. Patrício Farias, 131 - Itacorubi - Florianópolis - 3º andar |
>>>> 88034-132 | Santa Catarina, Brasil
>>>> <https://maps.google.com/?q=R.+Patr%C3%ADcio+Farias,+131+-+Itacorubi+-+Florian%C3%B3polis+-+3%C2%BA+andar+%7C+%E2%80%A888034-132+%7C+Santa+Catarina,+Brasil&entry=gmail&source=g>
>>>>
>>>> São Paulo | Florianópolis | New York
>>>>
>>>> <https://www.linkedin.com/company/1235825/>
>>>> <https://www.facebook.com/neoway.nw/>
>>>>
>>>> Conteúdo confidencial. Caso você não seja o real destinatário deste
>>>> e-mail por favor notifique o remetente e elimine esta mensagem. Privileged
>>>> and confidential. If you are not the intended addressee of this e-mail
>>>> please notify the sender and promptly delete this message.
>>>>
>>>>
>>> Hi Marcelo Kruger,
>>>
>>> From server configuration details provided, looks like a heavy trans DB
>>> & more logging enabled which might be resulting more write act on Master
>>> and Slave.
>>> At our first glance, we see *log_min_duration_statement* is set to 0
>>> which might be writing all the queries into the log resulting heavy writing
>>> activity that maybe cause for replication lag.
>>>
>>>
>>> Hope this helps you.
>>>
>>>
>>> --
>>>
>>>
>>
>>
>> --
>>
>>
>> Marcelo Krüger <https://br.linkedin.com/in/marcelo-kr%C3%BCger-40aab127>
>>
>> Data Integration
>>
>> Office: +55 48 3333-2030 <+55%2048%203333-2030> | Mobile: +55 48
>> 999679585 <+55%2048%2099967-9585>
>>
>> R. Patrício Farias, 131 - Itacorubi - Florianópolis - 3º andar |
>> 88034-132 | Santa Catarina, Brasil
>> <https://maps.google.com/?q=R.+Patr%C3%ADcio+Farias,+131+-+Itacorubi+-+Florian%C3%B3polis+-+3%C2%BA+andar+%7C+%E2%80%A888034-132+%7C+Santa+Catarina,+Brasil&entry=gmail&source=g>
>>
>> São Paulo | Florianópolis | New York
>>
>> <https://www.linkedin.com/company/1235825/>
>> <https://www.facebook.com/neoway.nw/>
>>
>> Conteúdo confidencial. Caso você não seja o real destinatário deste
>> e-mail por favor notifique o remetente e elimine esta mensagem. Privileged
>> and confidential. If you are not the intended addressee of this e-mail
>> please notify the sender and promptly delete this message.
>>
>>
>
>
> --
> <https://shreeyansh.com/>
>

--

Marcelo Krüger <https://br.linkedin.com/in/marcelo-kr%C3%BCger-40aab127>

Data Integration

Office: +55 48 3333-2030 | Mobile: +55 48 999679585

R. Patrício Farias, 131 - Itacorubi - Florianópolis - 3º andar | 88034-132
| Santa Catarina, Brasil

São Paulo | Florianópolis | New York

<https://www.linkedin.com/company/1235825/>
<https://www.facebook.com/neoway.nw/>

Conteúdo confidencial. Caso você não seja o real destinatário deste e-mail
por favor notifique o remetente e elimine esta mensagem. Privileged and
confidential. If you are not the intended addressee of this e-mail please
notify the sender and promptly delete this message.

In response to

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Tom Lane 2017-11-09 14:59:39 Re: PostgreSQL WAL file issue
Previous Message Günce Kaya 2017-11-09 11:59:00 Re: pg_basebackup permission denied error