High replication lag - Stream Replication

From: Marcelo Kruger <marcelo(dot)kruger(at)neoway(dot)com(dot)br>
To: pgsql-admin(at)postgresql(dot)org
Subject: High replication lag - Stream Replication
Date: 2017-11-08 13:38:41
Message-ID: CAN8toQMumbOxgx-tj+5iKX14p_JcsZthWm+PzmN67-BbSaPzfQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Good afternoon,

I have two servers hosted on Azure with PostgreSQL 9.6 databases. One
database is production, the other is standby.
For replication I use stream replication between servers. However the
application of the archives in the standby database is slow, always
generating replication lag.

The replication lag increases considerably when queries are performed on
the StandBy database. I wonder if there is any setting that can be made to
improve this performance.

*Server Production*
SO: CentOS Linux release 7.4.1708 (Core) 64Bits
CPU: Intel(R) Xeon(R) CPU E5-2698B v3 @ 2.00GHz - 15 Cores
Memory: 240GB
I/O: 17TB RAID0 (17 Disks (1TB for disk) SSD with read and write cache
enabled)

*Server StandBy*
SO: CentOS Linux release 7.4.1708 (Core) 64Bits
CPU: Intel(R) Xeon(R) CPU E5-2673 v3 @ 2.40GHz - 20 Cores
Memory: 140GB
I/O: 22TB RAID0 (22 Disks (1TB for disk) SSD with read and write cache
enabled)

*Configuration PostgreSQL Production*

postgres.conf

listen_addresses = '*'
port = 5433
max_connections = 2000
superuser_reserved_connections = 3
shared_buffers = 56GB
work_mem = 29360kB
maintenance_work_mem = 4GB
autovacuum_work_mem = 8GB
max_stack_depth = 5MB
dynamic_shared_memory_type = posix
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 700
bgwriter_lru_multiplier = 2.0
fsync = off
synchronous_commit = on
full_page_writes = off
wal_buffers = 1500MB
wal_writer_delay = 1ms
checkpoint_timeout = 10min
max_wal_size = 4GB
min_wal_size = 2GB
checkpoint_completion_target = 0.9
effective_cache_size = 168GB
default_statistics_target = 500
log_destination = 'csvlog'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%w.log'
log_file_mode = 0640
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 600MB
log_min_duration_statement = 0
log_checkpoints = off
log_connections = off
log_disconnections = off
log_duration = off
log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h'
log_lock_waits = on
log_timezone = 'Brazil/East'
autovacuum = on
log_autovacuum_min_duration = -1
autovacuum_max_workers = 12
autovacuum_naptime = 30s
autovacuum_vacuum_threshold = 20
autovacuum_analyze_threshold = 20
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay = 20
autovacuum_vacuum_cost_limit = 800
vacuum_cost_delay = 20
vacuum_cost_limit = 800
datestyle = 'iso, mdy'
timezone = 'Brazil/East'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 256
pgpool.pg_ctl = '/usr/pgsql-9.6/bin/pg_ctl'
effective_io_concurrency = 10
wal_level = hot_standby
max_wal_senders = 10
max_replication_slots = 3
wal_keep_segments = 38400

*Configuration PostgreSQL StandBy*

postgres.conf

listen_addresses = '*'
port = 5433
max_connections = 2000
superuser_reserved_connections = 3
shared_buffers = 10GB
work_mem = 183500kB
maintenance_work_mem = 4GB
autovacuum_work_mem = 8GB
max_stack_depth = 5MB
dynamic_shared_memory_type = posix
bgwriter_delay = 10ms
bgwriter_lru_maxpages = 700
bgwriter_lru_multiplier = 2.0
fsync = off
synchronous_commit = on
full_page_writes = off
wal_buffers = 16MB
wal_writer_delay = 1ms
checkpoint_timeout = 10min
max_wal_size = 4GB
min_wal_size = 2GB
checkpoint_completion_target = 0.9
effective_cache_size = 104GB
default_statistics_target = 500
log_destination = 'csvlog'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%w.log'
log_file_mode = 0640
log_truncate_on_rotation = on
log_rotation_age = 1d
log_rotation_size = 600MB
log_min_duration_statement = 0
log_checkpoints = off
log_connections = off
log_disconnections = off
log_duration = off
log_line_prefix = '%t [%p]: [%l-1] db=%d,user=%u,app=%a,client=%h'
log_lock_waits = on
log_timezone = 'Brazil/East'
autovacuum = on
log_autovacuum_min_duration = -1
autovacuum_max_workers = 3
autovacuum_naptime = 30s
autovacuum_vacuum_threshold = 20
autovacuum_analyze_threshold = 20
autovacuum_vacuum_scale_factor = 0.2
autovacuum_analyze_scale_factor = 0.1
autovacuum_vacuum_cost_delay = 20
autovacuum_vacuum_cost_limit = 800
vacuum_cost_delay = 20
vacuum_cost_limit = 800
datestyle = 'iso, mdy'
timezone = 'Brazil/East'
lc_messages = 'en_US.UTF-8'
lc_monetary = 'en_US.UTF-8'
lc_numeric = 'en_US.UTF-8'
lc_time = 'en_US.UTF-8'
default_text_search_config = 'pg_catalog.english'
max_locks_per_transaction = 256
pgpool.pg_ctl = '/usr/pgsql-9.6/bin/pg_ctl'
effective_io_concurrency = 10
max_worker_processes = 18
hot_standby = on
hot_standby_feedback = on

recovery.conf

standby_mode = 'on'
primary_slot_name = 'replicacao'
primary_conninfo = 'host=bdreplica00 port=5433 user=replicator password='
trigger_file = '/var/lib/pgsql/9.6/data/master_db.conf'

--

Marcelo Krüger <https://br.linkedin.com/in/marcelo-kr%C3%BCger-40aab127>

Data Integration

Office: +55 48 3333-2030 | Mobile: +55 48 999679585

R. Patrício Farias, 131 - Itacorubi - Florianópolis - 3º andar | 88034-132
| Santa Catarina, Brasil

São Paulo | Florianópolis | New York

<https://www.linkedin.com/company/1235825/>
<https://www.facebook.com/neoway.nw/>

Conteúdo confidencial. Caso você não seja o real destinatário deste e-mail
por favor notifique o remetente e elimine esta mensagem. Privileged and
confidential. If you are not the intended addressee of this e-mail please
notify the sender and promptly delete this message.

Responses

Browse pgsql-admin by date

  From Date Subject
Next Message Douglas Reed 2017-11-08 14:06:26 Problems rebuilding slave using pg_basebackup
Previous Message Douglas Reed 2017-11-08 13:00:20 Re: Problems rebuilding slave using pg_basebackup