From: | Elanchezhiyan Elango <elanelango(at)gmail(dot)com> |
---|---|
To: | pgsql-performance(at)postgresql(dot)org |
Subject: | Checkpoints and slow queries |
Date: | 2014-04-27 12:08:07 |
Message-ID: | CALqA5kjPnA8NNtp5SEOMjZmihH_sSUPXjr28=kZwwd2a-9FbAQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
*Problem description:*
After a few days of running in my test environment, a query timed out
(query timeout=4mins). Also in general the queries were taking a lot longer
than expected. The workload in my database is a write intensive workload.
And the writes happen in a burst every 5 minutes. There are a whole bunch
of insert and update queries that run every 5 minutes. When I analyzed the
situation (by enabling more postgres logs), I noticed that postgres
checkpoints were triggering approximately every 5 minutes and based on my
online research I suspected the i/o overhead of checkpoints was affecting
the query performance. The checkpoint related settings were:
checkpoint_segments = 30
checkpoint_timeout = 15min
I modified these settings to the following:
checkpoint_segments = 250
checkpoint_timeout = 1h
checkpoint_completion_target = 0.9
After I tweaked these settings, checkpoints were happening only once in an
hour and that improved the query performance. However, when the checkpoint
happens every hour, the query performance is still very poor. This is still
undesirable to my system. My question is how to track down the reason for
the poor performance during checkpoints and improve the query performance
when the checkpoints happen?
- *EXPLAIN ANALYZE:*
- http://explain.depesz.com/s/BNva - An insert query inserting just
129 rows takes 20 seconds.
- http://explain.depesz.com/s/5hA - An update query updating 43926
rows takes 55 seconds.
- *History:* It gets slower after a few days of the system running.
*Table Metadata*:
- The tables get updated every 5 minutes. Utmost 50000 rows in a table
get updated every 5 minutes. About 50000 rows get inserted every 1 hour.
- There are 90 tables in the DB. 43 of these are updated every 5
minutes. 8/90 tables receive a high update traffic of 50000 updates/5mins.
Remaining tables receive an update traffic of 2000 updates/5min. 43/90
tables are updated every 1 hour.
*PostgreSQL version: *PostgreSQL 9.1.9 on x86_64-unknown-linux-gnu,
compiled by gcc (GCC) 4.6.x-google 20120601 (prerelease), 64-bit
*How you installed PostgreSQL: *Compiled from source and installed.
*Changes made to the settings in the postgresql.conf file:*
name | current_setting | source
------------------------------+------------------------+----------------------
application_name | psql | client
checkpoint_completion_target | 0.9 | configuration file
checkpoint_segments | 250 | configuration file
checkpoint_timeout | 1h | configuration file
client_encoding | SQL_ASCII | client
client_min_messages | error | configuration file
constraint_exclusion | on | configuration file
DateStyle | ISO, MDY | configuration file
default_statistics_target | 800 | configuration file
default_text_search_config | pg_catalog.english | configuration file
effective_cache_size | 4GB | configuration file
lc_messages | C | configuration file
lc_monetary | C | configuration file
lc_numeric | C | configuration file
lc_time | C | configuration file
listen_addresses | localhost | configuration file
log_autovacuum_min_duration | 20s | configuration file
log_checkpoints | on | configuration file
log_connections | on | configuration file
log_destination | syslog | configuration file
log_disconnections | on | configuration file
log_line_prefix | user=%u,db=%d | configuration file
log_lock_waits | on | configuration file
log_min_duration_statement | 1s | configuration file
log_min_messages | error | configuration file
log_temp_files | 0 | configuration file
log_timezone | PST8PDT,M3.2.0,M11.1.0 | environment
variable
maintenance_work_mem | 64MB | configuration file
max_connections | 12 | configuration file
max_locks_per_transaction | 700 | configuration file
max_stack_depth | 2MB | environment
variable
port | 5432 | configuration file
shared_buffers | 500MB | configuration file
ssl | off | configuration file
statement_timeout | 4min | configuration file
syslog_facility | local1 | configuration file
syslog_ident | postgres | configuration file
temp_buffers | 256MB | configuration file
TimeZone | PST8PDT,M3.2.0,M11.1.0 | environment
variable
wal_buffers | 1MB | configuration file
work_mem | 128MB | configuration file
*Operating system and version: *Scientific Linux release 6.1 (Carbon)
*What program you're using to connect to PostgreSQL: *C++ libpqxx library
- *Relevant Schema*: All tables referenced in this question have this
same schema
managed_target_stats=> \d stat_300_3_1
Table "public.stat_300_40110_1"
Column | Type | Modifiers
--------+---------+-----------
ts | integer |
target | bigint |
port | integer |
data | real[] |
Indexes:
"unique_stat_300_40110_1" UNIQUE CONSTRAINT, btree (ts, target, port)
"idx_port_stat_300_40110_1" btree (port)
"idx_target_stat_300_40110_1" btree (target)
"idx_ts_stat_300_40110_1" btree (ts)
- *Hardware*:
- CPU: Intel(R) Xeon(R) CPU E5205 @ 1.86GHz
- Memory: 6GB
- Storage Details:
There are 2 500GB disks (/dev/sda, /dev/sdb) with the following 6
partitions on each disk.
*Number Start End Size Type File system Flags*
1 512B 24.7MB 24.7MB primary boot
2 24.7MB 6473MB 6449MB primary linux-swap(v1)
3 6473MB 40.8GB 34.4GB primary ext3
4 40.8GB 500GB 459GB extended lba
5 40.8GB 408GB 367GB logical ext3
6 408GB 472GB 64.4GB logical ext3
*Disk model and details:*
Model Family: Western Digital RE3 Serial ATA family
Device Model: WDC WD5002ABYS-02B1B0
Serial Number: WD-WCASYD132237
Firmware Version: 02.03B03
User Capacity: 500,107,862,016 bytes
Device is: In smartctl database [for details use: -P show]
ATA Version is: 8
ATA Standard is: Exact ATA specification draft version not indicated
Local Time is: Sun Apr 27 05:05:13 2014 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
The postgres data is stored on a software RAID10 on partition 5 of both
these disks.
[admin(at)chief-cmc2 tmp]# mdadm --detail /dev/md3
/dev/md3:
Version : 0.90
Creation Time : Wed Mar 19 06:40:57 2014
Raid Level : raid10
Array Size : 358402048 (341.80 GiB 367.00 GB)
Used Dev Size : 358402048 (341.80 GiB 367.00 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Sun Apr 27 04:22:07 2014
State : active
Active Devices : 2
Working Devices : 2
Failed Devices : 0
Spare Devices : 0
Layout : far=2
Chunk Size : 64K
UUID : 79d04a1b:99461915:3d186b3c:53958f34
Events : 0.24
Number Major Minor RaidDevice State
0 8 5 0 active sync /dev/sda5
1 8 21 1 active sync /dev/sdb5
- *Maintenance Setup*: autovacuum is running with default settings. Old
records are deleted every night. I also do 'vacuum full' on a 12 tables
that receive large number of updates every night at 1AM. I have noticed
that these 'vacuum full' also time out. (I am planning to post a separate
question regarding my vacuuming strategy).
- *WAL Configuration*: The WAL is in the same disk.
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2014-04-27 12:41:02 | Re: pl/pgsql performance |
Previous Message | Tom Lane | 2014-04-25 14:36:44 | Re: Poor performance for delete query |