From: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
---|---|
To: | Tom Dearman <tom(dot)dearman(at)gmail(dot)com> |
Cc: | "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: Waiting on ExclusiveLock on extension 9.3, 9.4 and 9.5 |
Date: | 2015-10-28 19:20:40 |
Message-ID: | CAMkU=1z=hr5vmzk6_0FjbsuedMpJ+w83ECuPMR94g3sDRczoKA@mail.gmail.com |
Lists: | pgsql-general |
On Wed, Oct 28, 2015 at 8:43 AM, Tom Dearman <tom(dot)dearman(at)gmail(dot)com> wrote:
> We have a performance problem when our postgres is under high load. The CPU
> usage is very low, we have 48 cores for our postgres and the idle time
> averages at 90%. The problem is we get spikes in our transaction times
> which don’t appear with any obvious regularity and when we get the larger
> spikes, if I look in the postgres log we see that there is locking on
> 'process 41915 acquired ExclusiveLock on extension of relation 27177 of
> database 26192'. The actual relation changes: one time it might be one table
> and another time another, though they are always big tables. I have looked
> at various previous threads and the only suggestions are either that the
> disk io is maxed out, which from our observations we don’t believe is the
> case for us,
What are those observations? Keep in mind that if 20 processes are
all trying to extend the relation at the same time, one will block on
IO (according to top/sar/vmstat etc.) and the other 19 will block on
that first one on a PostgreSQL heavy-weight lock. So all 20 of them
are effectively blocked on IO, but system monitoring tools won't know
that.
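To put rough numbers on that queueing effect, here is a minimal sketch. It assumes all backends request the extension lock at the same instant and that each extension costs a fixed amount of IO time; the function name and the 50 ms figure are made up for illustration and are not measurements from your system.

```python
def extension_waits(n_backends, io_seconds):
    """Model backends extending one relation, serialized by the extension lock.

    Returns a list of (lock_wait, io_wait) tuples, one per backend, in the
    order the backends acquire the lock. Simplifying assumptions: everyone
    arrives at time zero and every extension takes the same IO time.
    """
    waits = []
    for position in range(n_backends):
        # Each backend queues behind every backend that got the lock first.
        lock_wait = position * io_seconds
        waits.append((lock_wait, io_seconds))
    return waits


waits = extension_waits(n_backends=20, io_seconds=0.05)
latencies = [lock_wait + io_wait for lock_wait, io_wait in waits]

print(f"worst total latency: {max(latencies):.2f} s")
print(f"IO wait seen for that backend: {waits[-1][1]:.2f} s")
print(f"lock wait for that backend: {waits[-1][0]:.2f} s")
```

In this model only one backend at a time is ever in IO wait, so an OS-level tool attributes a small slice of the delay to IO while nearly all of the last backend's latency shows up as lock wait.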
Also, the IO spikes will be transient, so any monitoring that
over-averages will not pick up on them.
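As an illustration of how averaging hides a short spike, here is a small sketch with invented numbers: per-second disk utilization that sits at 5% except for a 10-second burst at 100%. The sample values and the `summarize` helper are hypothetical and only meant to show the arithmetic.

```python
from statistics import mean


def summarize(samples, window):
    """Split per-second samples into fixed windows; return (average, peak) per window."""
    summary = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        summary.append((mean(chunk), max(chunk)))
    return summary


# Hypothetical data: 5 minutes of per-second utilization (%), with one 10 s burst.
samples = [5.0] * 300
samples[120:130] = [100.0] * 10

coarse = summarize(samples, window=300)  # one 5-minute average
fine = summarize(samples, window=10)     # 10-second averages

print("5-minute window (avg, peak):", coarse[0])
print("busiest 10-second window (avg, peak):", max(fine))
```

The 5-minute average stays in the single digits even though the disk was saturated for 10 seconds, so it is worth sampling at a one-second interval (or looking at peaks rather than means) around the time of a latency spike.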
> or that 'shared_buffers' is too large - so we have reduced this
> right down to 1G. In the previous threads there was an indication that the
> underlying problem was a lock which I believe has been either removed or
> much improved in 9.5 (see Lock scalability improvements), however we have
> not seen any improvement in the relation extension locking problem that we
> see. The version of 9.5 that we have tested is beta1. Any help in showing
> us how to improve this would be greatly appreciated.
I don't believe any of the improvements made were to this area.
Your latency spikes seem to be happening at a 20 minute interval.
That would make me think they are lined up with end-of-checkpoint
fsync activity, except those should be happening every 5 minutes as
your conf has not changed checkpoint_timeout away from the default.
Since you have log_checkpoints on, what do you see in the log files
about how often they occur, and what the checkpoint write time, sync
time, etc. are?
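If it helps to pull those numbers out of the log, a rough sketch follows. It assumes each log line starts with a `%t`-style timestamp from log_line_prefix and that the "checkpoint complete" lines carry the usual `write=... s, sync=... s, total=... s` fields; the two sample lines are invented for illustration, not taken from your logs, and the helper names are hypothetical.

```python
import re
from datetime import datetime

# Assumption: lines begin with "YYYY-MM-DD HH:MM:SS"; adjust for your log_line_prefix.
CHECKPOINT_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}).*?"
    r"checkpoint complete:.*?"
    r"write=(?P<write>[\d.]+) s, sync=(?P<sync>[\d.]+) s, total=(?P<total>[\d.]+) s"
)


def parse_checkpoints(lines):
    """Return one dict per 'checkpoint complete' line with its timestamp and timings."""
    rows = []
    for line in lines:
        match = CHECKPOINT_RE.search(line)
        if match:
            rows.append({
                "ts": datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S"),
                "write": float(match.group("write")),
                "sync": float(match.group("sync")),
                "total": float(match.group("total")),
            })
    return rows


def intervals_seconds(rows):
    """Seconds between consecutive checkpoint completions."""
    return [(b["ts"] - a["ts"]).total_seconds() for a, b in zip(rows, rows[1:])]


# Invented sample lines, for illustration only.
sample_log = [
    "2015-10-28 18:00:00 GMT LOG:  checkpoint starting: time",
    "2015-10-28 18:02:31 GMT LOG:  checkpoint complete: wrote 8192 buffers (6.2%); "
    "0 transaction log file(s) added, 0 removed, 3 recycled; "
    "write=149.851 s, sync=2.312 s, total=152.201 s; "
    "sync files=40, longest=1.900 s, average=0.057 s",
    "2015-10-28 18:07:33 GMT LOG:  checkpoint complete: wrote 9000 buffers (6.9%); "
    "0 transaction log file(s) added, 0 removed, 2 recycled; "
    "write=149.900 s, sync=0.035 s, total=149.950 s; "
    "sync files=38, longest=0.012 s, average=0.000 s",
]

rows = parse_checkpoints(sample_log)
for row in rows:
    print(row["ts"], "write", row["write"], "sync", row["sync"], "total", row["total"])
print("intervals (s):", intervals_seconds(rows))
```

Lining up the completion timestamps and the sync times against the latency spikes should show whether the spikes coincide with the end of a checkpoint.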
Cheers,
Jeff