Re: Waiting on ExclusiveLock on extension 9.3, 9.4 and 9.5

From: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To: Tom Dearman <tom(dot)dearman(at)gmail(dot)com>
Cc: "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: Waiting on ExclusiveLock on extension 9.3, 9.4 and 9.5
Date: 2015-10-28 19:20:40
Message-ID: CAMkU=1z=hr5vmzk6_0FjbsuedMpJ+w83ECuPMR94g3sDRczoKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Wed, Oct 28, 2015 at 8:43 AM, Tom Dearman <tom(dot)dearman(at)gmail(dot)com> wrote:
> We have a performance problem when our postgres is under high load. The CPU
> usage is very low, we have 48 cores for our postgres and the idle time
> averages at 90%. The problem is we get spikes in our transaction times
> which don’t appear with any obvious regularity and when we get the larger
> spikes, if I look in the postgres log we see that there is locking on
> 'process 41915 acquired ExclusiveLock on extension of relation 27177 of
> database 26192’. The actual relation changes one time it might be one table
> and another time another, though they are always big tables. I have looked
> at various previous threads and the only suggestions are either that the
> disk io is maxed out, which from our observations we don’t believe is the
> case for us,

What are those observations? Keep in mind that if 20 processes are
all trying to extend the relation at the same time, one will block on
IO (according to top/sar/vmstat etc.) and the other 19 will block on
that first one on a PostgreSQL heavy-weight lock. So all 20 of them
are effectively blocked on IO, but system monitoring tools won't know
that.

Also, the IO spikes will be transient, so any monitoring that
over-averages will not pick up on them.

> or that ‘shared_buffers’ is to large - so we have reduced this
> right down to 1G. In the previous threads there was an indication that the
> underlying problem was a lock which I believe has been either removed or
> much improved in 9.5 (see Lock scalability improvements), however we have
> not seen any improvement in the relation extension locking problem that we
> see. The version of 9.5 that we have tested is beta1. Any help in showing
> us how to improve this would be greatly appreciated.

I don't believe any of the improvements made were to this area.

Your latency spikes seem to be happening at a 20 minute interval.
That would make me think they are lined up with end-of-checkpoint
fsync activity, except those should be happening every 5 minutes as
your conf has not changed checkpoint_timeout away from the default.
Since you have log_checkpoints on, what do you see in the log files
about how often they occur, and what the checkpoint write time, sync
time, etc. are?

Cheers,

Jeff

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Edson Richter 2015-10-28 19:43:26 Re: PostgreSQL Timezone and Brazilian DST
Previous Message Mike 2015-10-28 19:04:50 regexp_replace to remove sql comments