Re: Slow queries on 9.3.1 despite use of index

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Michael van Rooyen <michael(at)loot(dot)co(dot)za>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Slow queries on 9.3.1 despite use of index
Date: 2014-04-28 17:52:41
Message-ID: CAMkU=1xJE1e_gZywtwHCPJGVnbt7zLWXcSgyyMRAa0ueCYi4sg@mail.gmail.com
Lists: pgsql-performance

On Mon, Apr 28, 2014 at 10:12 AM, Michael van Rooyen <michael(at)loot(dot)co(dot)za> wrote:

> I'm trying to get to the bottom of a performance issue on a server running
> PostgreSQL 9.3.1 on Centos 5. The machine is a dual quad-core Xeon E5620
> with 24GB ECC RAM and four enterprise SATA Seagate Constellation ES drives
> configured as 2 software RAID1 volumes. The main DB is on one volume and
> some of the more heavily used indexes and the WAL logs are on the other.
>
> Overall, the load doesn't appear to be heavy, but we're still getting slow
> queries, for example, here's an extract from the log:
>
> 2014-04-28 16:51:02.904 GMT 25998: LOG: checkpoint starting: time
> 2014-04-28 16:53:37.204 GMT 30053: LOG: duration: 1067.464 ms execute
> <unnamed>: select "id" from "ProductSupplier" where
> "product"='25553848082928'
> 2014-04-28 16:54:12.701 GMT 30053: LOG: duration: 1105.844 ms execute
> <unnamed>: select "id" from "ProductSupplier" where
> "product"='1626407082928'
> 2014-04-28 16:54:46.789 GMT 30053: LOG: duration: 1060.585 ms execute
> <unnamed>: select

...

> 2014-04-28 16:55:58.058 GMT 30053: LOG: duration: 1309.192 ms execute
> <unnamed>: select "id" from "ProductCategory" where
> "product"='1932872082928'
> 2014-04-28 16:56:06.019 GMT 25998: LOG: checkpoint complete: wrote 6647
> buffers (2.5%); 0 transaction log file(s) added, 0 removed, 2 recycled;
> write=298.862 s, sync=4.072 s, total=303.115 s; sync files=155,
> longest=0.234 s, average=0.026 s
>

It looks like something is causing your IO to seize up briefly. It is
common for the sync phase of the checkpoint to do that, but that would only
explain 3 of the 4 reports above.

Is this causing an actual problem for your users, or are you just trying to
be proactive?

You could change the kernel setting dirty_background_bytes to try to reduce
this problem.
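For example (only a sketch -- the 64MB figure is purely illustrative and would
need testing on your hardware, and note that dirty_background_bytes only exists
on newer kernels, roughly 2.6.29 and later, so a stock CentOS 5 kernel may only
offer dirty_background_ratio):

    # takes effect immediately:
    echo 67108864 > /proc/sys/vm/dirty_background_bytes
    # or to persist across reboots, add to /etc/sysctl.conf:
    vm.dirty_background_bytes = 67108864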

> Overall the load on the server seems quite low; for example, typical
> vmstat 1 output is:
>

> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
>  r  b   swpd   free   buff    cache   si   so    bi    bo   in    cs us sy id wa st
>  0  1    304  77740  11960 17709156    0    0    99    82    2     2  2  1 89  8  0
>  1  0    304  75228  11960 17711164    0    0  1256   635 1418  6498  0  0 94  6  0
>  0  1    304  73440  11968 17712036    0    0  1232   149 1253  6232  1  0 94  6  0
>  0  2    304  78472  11968 17706016    0    0  1760    89 1325  6361  1  0 94  5  0
>  0  1    304  75616  11968 17708480    0    0  2164    72 1371  6855  1  0 94  5  0
>  0  1    304  73292  11968 17710320    0    0  1760   112 1335  6673  1  0 94  5  0
>  0  2    304  77956  11964 17703528    0    0  1204  5614 1867  6712  0  0 94  6  0
>

Is that typical for when the problem is not occurring, or typical for when
it is occurring? Without timestamps to correlate the vmstat output back to
the log file, it is very hard to make use of this info. Some versions of
vmstat have a -t flag.
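If yours does, something like this (just a sketch; the interval and file name
are arbitrary) adds a timestamp column to each sample so it can be lined up
against the postgres log:

    vmstat -t 1 | tee vmstat-with-timestamps.log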

>
>
> I've tried to optimise postgresql.conf for performance:
>
> max_connections = 1000 # (change requires restart)
>

1000 is extremely high. How many connections do you actually use at any
one time?
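One easy way to check, for example (pg_stat_activity is standard in 9.3;
sampling it a few times during your busiest period is just one approach):

    psql -c "select state, count(*) from pg_stat_activity group by state;"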

> shared_buffers = 2GB # min 128kB or max_connections*16kB
> work_mem = 100MB # min 64kB
>

100MB is also very high, at least in conjunction with the high
max_connections.
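In the worst case, 1000 backends each doing a single 100MB sort would want
roughly 1000 * 100MB = 100GB, several times the 24GB of RAM on the box, and a
single complex query can use several multiples of work_mem on its own.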

> maintenance_work_mem = 100MB # min 1MB
> synchronous_commit = off # immediate fsync at commit
> wal_buffers = 16MB # min 32kB
> checkpoint_segments = 64 # in logfile segments, min 1, 16MB
> each
> checkpoint_timeout = 10min # range 30s-1h
> effective_cache_size = 16GB
> logging_collector = on # Enable capturing of stderr and
> csvlog
> log_directory = 'pg_log' # directory where log files are
> written,
> log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # log file name pattern,
> log_rotation_age = 1d # Automatic rotation of logfiles
> will
> log_min_duration_statement = 1000 # -1 is disabled, 0 logs all
> statements
>

I would lower this. You can see that a few statements were just over 1000
ms, but you can't tell whether there are a lot at 800 ms, or whether you have
a bimodal distribution with most at 1 ms and a few at 1200 ms.
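For example (250 ms is just an illustrative threshold; pick whatever gives a
useful picture without flooding the log):

    log_min_duration_statement = 250   # in postgresql.conf, then: pg_ctl reload
                                       # (or: select pg_reload_conf();)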

Cheers,

Jeff
