Re: pgbench: spike in pgbench results(graphs) while testing pg_hint_plan performance

From: sachin kotwal <kotsachin(at)gmail(dot)com>
To: pgsql-performance(at)postgresql(dot)org
Subject: Re: pgbench: spike in pgbench results(graphs) while testing pg_hint_plan performance
Date: 2013-05-24 13:22:22
Message-ID: 1369401742323-5756740.post@n5.nabble.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Thanks for reply.

>This could easily be caused by something outside of the test itself.
Background processes. A monitoring system kicking in to write some data to
disk will cause a drop like this.

>Three things you could look into to try and track down the issue:

>-Did you turn on log_checkpoints to see if one happens to run when the rate
drops?
Yes. I have turn on log_checkpoints.

Here i am sending my postgrsql.conf details:
postgresql.conf for pgbench-client and server machine.

Parameter: Value

Connection setup: Connection Settings

listen_addresses: *
max_connections: 150
superuser_reserved_connections: 10
tcp_keepalives_idle: 60
tcp_keepalives_interval: 5
tcp_keepalives_count: 5
Resource Consumption (Memory)
shared_buffers: 4 GB
maintenance_work_mem: 800 MB

Write Ahead Log

wal_level: archive
wal_sync_method: fdatasync
wal_buffers: 16 MB
synchronous_commit: on

Checkpoint

checkpoint_segments: 300
checkpoint_timeout:15 min
archive_mode: on
archive_command: cp %p /archive/archive_pg_hint_plan/%f

Query Planning

random_page_cost: 2
effective_cache_size: 8 GB
default_statistics_target: 10

Error Reporting and Logging

logging_collector: on
log_directory: /var/log/pgsql
log_filename: postgresql.log
log_rotation_age: 1d
log_rotation_size: 0
log_truncate_on_rotation: off
log_min_duration_statement: 20s
log_checkpoints: on
log_error_verbosity : verbose
log_connections: on
log_line_prefix: %t %p %c-%l %x %q %u, %d, %r, %a
log_lock_waits: on

Automatic Vacuuming

log_autovacuum_min_duration:1 min
autovacuum_max_workers: 4
autovacuum_freeze_max_age: 2000000000
autovacuum_vacuum_cost_limit: 400

>If so, that could be the explanation. Looks like you setup the test to
make this less likely, but checkpoints are tough to eliminate altogether.

>-Does this happen on every test run?
Yes. It happen on every test run.

>Is it at the same time?
No. It is not occurring at same time.There is no as such pattern.

>-You can run "top -bc" to dump snapshots of what the system is doing every
second. With some work you can then figure out what was actually happening
during the two seconds around when the throughput dropped.

I have used vmstat, iostat, sar, pidstat, top etc.

here i am sending some snaps from above tolls logs.

1. postgresql log
From Server:
2013-05-22 12:08:00 IST 19697 519c65a7.4cf1-1 0 LOG: 00000: checkpoint
starting: immediate force wait
2013-05-22 12:08:00 IST 19697 519c65a7.4cf1-2 0 LOCATION:
LogCheckpointStart, xlog.c:7638
2013-05-22 12:08:04 IST 19697 519c65a7.4cf1-3 0 LOG: 00000: checkpoint
complete: wrote 2320 buffers (0.4%); 0 transaction log file(s) added, 0
removed, 1 recycled; write=0.045 s, sync=3.606 s, total=4.058 s; sync
files=48, longest=1.425 s, average=0.075 s

2013-05-22 12:08:05 IST 20587 519c67cd.506b-3 0 postgres_user, pg_bench,
172.26.127.101(33356), [unknown]LOG: 00000: connection authorized:
user=postgres_user database=pg_bench
2013-05-22 12:08:05 IST 20587 519c67cd.506b-4 0 postgres_user, pg_bench,
172.26.127.101(33356), [unknown]LOCATION: PerformAuthentication,
postinit.c:230
2013-05-22 12:13:05 IST 21486 519c68f9.53ee-1 0 [unknown], [unknown], ,
[unknown]LOG: 00000: connection received: host=172.26.127.101 port=33362
2013-05-22 12:13:05 IST 21486 519c68f9.53ee-2 0 [unknown], [unknown], ,
[unknown]LOCATION: BackendInitialize, postmaster.c:3476
2013-05-22 12:13:05 IST 21486 519c68f9.53ee-3 0 postgres_user, pg_bench,
172.26.127.101(33362), [unknown]LOG: 00000: connection authorized:
user=postgres_user database=pg_bench

2013-05-22 12:13:07 IST 19697 519c65a7.4cf1-5 0 LOG: 00000: checkpoint
starting: immediate force wait
2013-05-22 12:13:07 IST 19697 519c65a7.4cf1-6 0 LOCATION:
LogCheckpointStart, xlog.c:7638
2013-05-22 12:13:07 IST 19697 519c65a7.4cf1-7 0 LOG: 00000: checkpoint
complete: wrote 4 buffers (0.0%); 0 transaction log file(s) added, 0
removed, 234 recycled; write=0.012 s, sync=0.082 s, total=0.222 s; sync
files=6, longest=0.028 s, average=0.013 s

----------------------------------------------------------------------------------------------------------------------------------------------------------------
From pgbench-Client:

2013-05-22 12:08:08 IST 3988 519c67d0.f94-3 0 postgres_user, results,
::1(59216), [unknown]LOG: 00000: connection authorized: user=postgres_user
database=results
2013-05-22 12:08:08 IST 3988 519c67d0.f94-4 0 postgres_user, results,
::1(59216), [unknown]LOCATION: PerformAuthentication, postinit.c:230
2013-05-22 12:13:08 IST 4810 519c68fc.12ca-1 0 [unknown], [unknown], ,
[unknown]LOG: 00000: connection received: host=::1 port=59343
2013-05-22 12:13:08 IST 4810 519c68fc.12ca-2 0 [unknown], [unknown], ,
[unknown]LOCATION: BackendInitialize, postmaster.c:3476

==================================================================================
2. iostat log
From Server

2013-05-22 12:09:13.563992 05/22/2013 12:09:10 PM
2013-05-22 12:09:13.564105 Device: tps Blk_read/s Blk_wrtn/s
Blk_read Blk_wrtn
2013-05-22 12:09:13.564187 sda 3.00 0.00 32.00
0 32
2013-05-22 12:09:13.564234
*2013-05-22 12:09:14.563849 05/22/2013 12:09:11 PM
2013-05-22 12:09:15.003184 Device: tps Blk_read/s Blk_wrtn/s
Blk_read Blk_wrtn
2013-05-22 12:09:15.003253 sda 0.00 0.00 0.00
0 0
2013-05-22 12:09:15.003277 *
2013-05-22 12:09:15.563731 05/22/2013 12:09:12 PM
2013-05-22 12:09:15.563836 Device: tps Blk_read/s Blk_wrtn/s
Blk_read Blk_wrtn
2013-05-22 12:09:15.563894 sda 16.00 0.00 144.00
0 144
2013-05-22 12:09:15.563923
----------------------------------------------------------------------------------------------------------------------------------------------------------------
From pgbench-Client

2013-05-22 12:09:13.746632 05/22/2013 12:09:13 PM
2013-05-22 12:09:13.746709 Device: tps Blk_read/s Blk_wrtn/s
Blk_read Blk_wrtn
2013-05-22 12:09:13.746753 sda 0.00 0.00 0.00
0 0
2013-05-22 12:09:13.746774
*2013-05-22 12:09:14.746723 05/22/2013 12:09:14 PM
2013-05-22 12:09:15.003297 Device: tps Blk_read/s Blk_wrtn/s
Blk_read Blk_wrtn
2013-05-22 12:09:15.003382 sda 104.00 0.00 77576.00
0 77576
2013-05-22 12:09:15.003417 *
2013-05-22 12:09:15.747222 05/22/2013 12:09:15 PM
2013-05-22 12:09:15.747324 Device: tps Blk_read/s Blk_wrtn/s
Blk_read Blk_wrtn
2013-05-22 12:09:15.747392 sda 1031.00 8.00 244832.00
8 244832
2013-05-22 12:09:15.747423
==================================================================================
3. vmstat log
From Server
-------------------------------------------------procs
-----------memory---------- ---swap-- -----io---- --system--
-----cpu------ ---timestamp---
2013-05-22 12:09:13.601546 54 0 51008 1640400 193400 28216852 0 0
0 16 90753 320892 63 24 13 0 0 2013-05-22 12:09:10 IST
*2013-05-22 12:09:14.602209 84 0 51008 1640384 193400 28216864 0 0
0 0 91317 319178 64 25 11 0 0 2013-05-22 12:09:11 IST*
2013-05-22 12:09:15.602892 72 0 51008 1640352 193404 28216876 0 0
0 72 60904 203523 39 15 45 0 0 2013-05-22 12:09:12 IST
----------------------------------------------------------------------------------------------------------------------------------------------------------------
From pgbench-Client
-----------------------------------------------procs
-----------memory---------- ---swap-- -----io---- --system--
-----cpu------ ---timestamp---
2013-05-22 12:09:13.777954 6 0 58640 3676280 62260 3838180 0 0
0 0 143224 13530 8 12 80 0 0 2013-05-22 12:09:13 IST
*2013-05-22 12:09:14.778444 0 3 58640 3666952 62260 3843848 0 0
0 74044 126646 12754 8 10 79 3 0 2013-05-22 12:09:14 IST*
2013-05-22 12:09:15.778965 5 0 58640 3663744 62272 3848352 0 0
4 87160 105091 8539 6 8 81 5 0 2013-05-22 12:09:15 IST
==================================================================================
4. sar log
From Server
-------------------------------------------------------------------CPU
%usr %nice %sys %iowait %steal %irq %soft %guest
%idle
2013-05-22 12:09:13.574923 12:09:10 PM all 63.10 0.00 16.91
0.03 0.00 0.00 7.16 0.00 12.80
*2013-05-22 12:09:14.575210 12:09:11 PM all 64.09 0.00
17.09 0.00 0.00 0.00 7.42 0.00 11.40*
2013-05-22 12:09:15.574179 12:09:12 PM all 39.79 0.00 11.03
0.00 0.00 0.00 4.38 0.00 44.80

----------------------------------------------------------------------------------------------------------------------------------------------------------------
From pgbench-Client
-------------------------------------------------------------------CPU
%usr %nice %sys %iowait %steal %irq %soft %guest
%idle
2013-05-22 12:09:13.788155 12:09:13 PM all 8.38 0.00 7.91
0.00 0.00 0.00 3.70 0.00 80.01
*2013-05-22 12:09:14.788134 12:09:14 PM all 7.66 0.00
6.98 3.00 0.00 0.00 3.39 0.00 78.97*
2013-05-22 12:09:15.788362 12:09:15 PM all 5.88 0.00 5.63
4.61 0.00 0.00 2.62 0.00 81.24
==================================================================================
5. top command (top -d 1)log
From Server
12:09:13
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
3362 root 20 0 0 0 0 S 1.0 0.0 34:05.66 kondemand/25
31517 postgres 20 0 15688 1932 952 R 1.0 0.0 0:28.35 top
1 root 20 0 19348 1036 808 S 0.0 0.0 0:10.57 init

*12:09:14
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31517 postgres 20 0 15688 1932 952 R 2.0 0.0 0:28.37 top
1 root 20 0 19348 1036 808 S 0.0 0.0 0:10.57 init *

12:09:15
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31517 postgres 20 0 15688 1932 952 R 2.9 0.0 0:28.40 top
143 root 20 0 0 0 0 S 1.0 0.0 1:04.72 events/12
1 root 20 0 19348 1036 808 S 0.0 0.0 0:10.57 init
----------------------------------------------------------------------------------------------------------------------------------------------------------------
From pgbench-Client
12:09:13
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19157 postgres 20 0 121m 4172 1872 R 98.7 0.1 5:53.45 python
1 root 20 0 19348 580 404 S 0.0 0.0 0:01.41 init

*12:09:14
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19157 postgres 20 0 121m 4172 1872 R 99.6 0.1 5:54.46 python
18946 postgres 20 0 15424 1676 940 R 2.0 0.0 0:18.97 top
1 root 20 0 19348 580 404 S 0.0 0.0 0:01.41 init *

12:09:15
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
19157 postgres 20 0 121m 4172 1872 R 98.7 0.1 5:55.46 python
2658 root 20 0 4064 264 212 S 1.0 0.0 0:08.64 cpuspeed
18946 postgres 20 0 15424 1676 940 R 1.0 0.0 0:18.98 top
1 root 20 0 19348 580 404 S 0.0 0.0 0:01.41 init

--
View this message in context: http://postgresql.1045698.n5.nabble.com/pgbench-spike-in-pgbench-results-graphs-while-testing-pg-hint-plan-performance-tp5756585p5756740.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message fburgess 2013-05-24 21:44:28 Re: [PERFORM] Very slow inner join query Unacceptable latency.
Previous Message Greg Smith 2013-05-24 12:21:31 Re: pgbench: spike in pgbench results(graphs) while testing pg_hint_plan performance