Re: Replaying 48 WAL files takes 80 minutes

From: "ktm(at)rice(dot)edu" <ktm(at)rice(dot)edu>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: pgsql-performance(at)postgresql(dot)org
Subject: Re: Replaying 48 WAL files takes 80 minutes
Date: 2012-10-29 13:19:17
Message-ID: 20121029131917.GE2872@aart.rice.edu
Lists: pgsql-performance

On Mon, Oct 29, 2012 at 02:05:24PM +0100, Albe Laurenz wrote:
> I am configuring streaming replication with hot standby
> with PostgreSQL 9.1.3 on RHEL 6 (kernel 2.6.32-220.el6.x86_64).
> PostgreSQL was compiled from source.
>
> It works fine, except that starting the standby took forever:
> it took the system more than 80 minutes to replay 48 WAL files
> and connect to the primary.
>
> Can anybody think of an explanation why it takes that long?
>
> This is decent hardware: 24 cores of AMD Opteron 6174, 128 GB RAM,
> NetApp SAN attached with 8 GBit Fibrechannel (ext4 file system).
> An identical system performed fine in performance tests.
>
> Here is the log; I have edited it for readability:
>
> 2012-10-29 09:22:22.945 database system was interrupted; last known up
> at 2012-10-26 01:11:59 CEST
> 2012-10-29 09:22:22.945 creating missing WAL directory
> "pg_xlog/archive_status"
> 2012-10-29 09:22:22.947 entering standby mode
> 2012-10-29 09:22:23.434 restored log file "00000001000001D1000000C4"
> from archive
> 2012-10-29 09:22:23.453 redo starts at 1D1/C4000020
> 2012-10-29 09:22:25.847 restored log file "00000001000001D1000000C5"
> from archive
> 2012-10-29 09:22:27.457 restored log file "00000001000001D1000000C6"
> from archive
> 2012-10-29 09:22:28.946 restored log file "00000001000001D1000000C7"
> from archive
> 2012-10-29 09:22:30.421 restored log file "00000001000001D1000000C8"
> from archive
> 2012-10-29 09:22:31.243 restored log file "00000001000001D1000000C9"
> from archive
> 2012-10-29 09:22:32.194 restored log file "00000001000001D1000000CA"
> from archive
> 2012-10-29 09:22:33.169 restored log file "00000001000001D1000000CB"
> from archive
> 2012-10-29 09:22:33.565 restored log file "00000001000001D1000000CC"
> from archive
> 2012-10-29 09:23:35.451 restored log file "00000001000001D1000000CD"
> from archive
>
> Everything is nice until here.
> Replaying this WAL file suddenly takes 1.5 minutes instead
> of mere seconds as before.
>
> 2012-10-29 09:24:54.761 restored log file "00000001000001D1000000CE"
> from archive
> 2012-10-29 09:27:23.013 restartpoint starting: time
> 2012-10-29 09:28:12.200 restartpoint complete: wrote 242 buffers (0.0%);
>                         0 transaction log file(s) added, 0 removed, 0 recycled;
>                         write=48.987 s, sync=0.185 s, total=49.184 s;
>                         sync files=1096, longest=0.016 s, average=0.000 s
> 2012-10-29 09:28:12.206 recovery restart point at 1D1/CC618278
> 2012-10-29 09:28:31.226 restored log file "00000001000001D1000000CF"
> from archive
>
> Again there is a difference of 2.5 minutes
> between these WAL files, only 50 seconds of
> which were spent in the restartpoint.
>
> From here on it continues in quite the same vein.
> Some WAL files are restored in seconds, but some take
> more than 4 minutes.
>
> I'll skip to the end of the log:
>
> 2012-10-29 10:37:53.809 restored log file "00000001000001D1000000EF"
> from archive
> 2012-10-29 10:38:53.194 restartpoint starting: time
> 2012-10-29 10:39:25.929 restartpoint complete: wrote 161 buffers (0.0%);
>                         0 transaction log file(s) added, 0 removed, 0 recycled;
>                         write=32.661 s, sync=0.066 s, total=32.734 s;
>                         sync files=251, longest=0.003 s, average=0.000 s
> 2012-10-29 10:39:25.929 recovery restart point at 1D1/ED95C728
> 2012-10-29 10:42:56.153 restored log file "00000001000001D1000000F0"
> from archive
> 2012-10-29 10:43:53.062 restartpoint starting: time
> 2012-10-29 10:45:36.871 restored log file "00000001000001D1000000F1"
> from archive
> 2012-10-29 10:45:39.832 restartpoint complete: wrote 594 buffers (0.0%);
>                         0 transaction log file(s) added, 0 removed, 0 recycled;
>                         write=106.666 s, sync=0.093 s, total=106.769 s;
>                         sync files=729, longest=0.004 s, average=0.000 s
> 2012-10-29 10:45:39.832 recovery restart point at 1D1/EF5D4340
> 2012-10-29 10:46:13.602 restored log file "00000001000001D1000000F2"
> from archive
> 2012-10-29 10:47:38.396 restored log file "00000001000001D1000000F3"
> from archive
> 2012-10-29 10:47:38.962 streaming replication successfully connected to
> primary
>
> I'd be happy if somebody could shed light on this.
>
> Yours,
> Laurenz Albe
>
> PS: Here is the configuration:
>
> name | current_setting
> ------------------------------+---------------------------
> version | PostgreSQL 9.1.3 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3), 64-bit
> archive_command | gzip -1 <"%p" | tee /POSTGRES/data/exchange/"%f".gz >/POSTGRES/data/backups/ELAK/"%f".gz
> archive_mode | on
> checkpoint_completion_target | 0.9
> checkpoint_segments | 30
> client_encoding | UTF8
> constraint_exclusion | off
> cursor_tuple_fraction | 1
> custom_variable_classes | pg_stat_statements
> default_statistics_target | 1000
> effective_cache_size | 64GB
> hot_standby | on
> lc_collate | de_DE.UTF8
> lc_ctype | de_DE.UTF8
> listen_addresses | *
> log_checkpoints | on
> log_connections | on
> log_destination | csvlog
> log_directory | /POSTGRES/data/logs/ELAK
> log_disconnections | on
> log_filename | ELAK-%Y-%m-%d.log
> log_lock_waits | on
> log_min_duration_statement | 3s
> log_min_error_statement | log
> log_min_messages | log
> log_rotation_size | 0
> log_statement | ddl
> log_temp_files | 0
> logging_collector | on
> maintenance_work_mem | 1GB
> max_connections | 800
> max_prepared_transactions | 800
> max_stack_depth | 9MB
> max_standby_archive_delay | 0
> max_standby_streaming_delay | 0
> max_wal_senders | 2
> pg_stat_statements.max | 5000
> pg_stat_statements.track | all
> port | 55503
> server_encoding | UTF8
> shared_buffers | 16GB
> shared_preload_libraries | pg_stat_statements,passwordcheck
> ssl | on
> tcp_keepalives_count | 0
> tcp_keepalives_idle | 0
> TimeZone | Europe/Vienna
> wal_buffers | 16MB
> wal_level | hot_standby
> work_mem | 8MB
> (49 rows)
>
>
Hi Albe,

My first guess would be that something else was using I/O resources on your
NetApp. That is the behavior you would expect once the I/O cache on the NetApp
has filled and writes actually have to go to the underlying disks. Is this a
dedicated box? Can you get I/O stats from the box during the recovery?
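
For example (just a sketch, assuming the sysstat package is installed on the
standby; the output paths and the process-title pattern are only placeholders),
something like this run while the standby is replaying WAL would show whether
the slow files coincide with the disks being busy:

  # extended per-device statistics every 5 seconds
  iostat -xk 5 > /tmp/recovery_iostat.log 2>&1 &

  # I/O done by the recovery process itself (the exact process title may vary)
  pidstat -d -p $(pgrep -f "startup process") 5 > /tmp/recovery_pidstat.log 2>&1 &

If the NetApp LUN shows high await/%util while the startup process itself reads
and writes relatively little, that would point at contention from something
else on the filer rather than at the replay itself.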

Regards,
Ken
