Re: Problem with ExclusiveLock on inserts

From: Ilya Kosmodemiansky <hydrobiont(at)gmail(dot)com>
To: Бородин Владимир <root(at)simply(dot)name>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: Problem with ExclusiveLock on inserts
Date: 2014-02-12 19:37:11
Message-ID: CABvo6pStCUS=szPf4bmD8EX1WDEZGtxL24Ua2R6Yv=A913x+Yw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

another thing which is arguable - concurrency degree. How many of your
max_connections = 4000 are actually running? 4000 definitely looks like an
overkill and they could be a serious source of concurrency, especially then
you have had barrier enabled and software raid.

Plus for 32Gb of shared buffers with synchronous_commit = on especially on
heavy workload one should definitely have bbu, otherwise performance will
be poor.

On Wed, Feb 12, 2014 at 8:20 PM, Бородин Владимир <root(at)simply(dot)name> wrote:

> Oh, I haven't thought about barriers, sorry. Although I use soft raid
> without batteries I have turned barriers off on one cluster shard to try.
>
> root(at)rpopdb01e ~ # mount | fgrep data
> /dev/md2 on /var/lib/pgsql/9.3/data type ext4 (rw,noatime,nodiratime)
> root(at)rpopdb01e ~ # mount -o remount,nobarrier /dev/md2
> root(at)rpopdb01e ~ # mount | fgrep data
> /dev/md2 on /var/lib/pgsql/9.3/data type ext4
> (rw,noatime,nodiratime,nobarrier)
> root(at)rpopdb01e ~ #
>
> 12.02.2014, в 21:56, Ilya Kosmodemiansky <hydrobiont(at)gmail(dot)com>
> написал(а):
>
> My question was actually about barrier option, by default it is enabled on
> RHEL6/ext4 and could cause serious bottleneck on io before disks are
> actually involved. What says mount without arguments?
>
> On Feb 12, 2014, at 18:43, Бородин Владимир <root(at)simply(dot)name> wrote:
>
> root(at)rpopdb01e ~ # fgrep data /etc/fstab
> UUID=f815fd3f-e4e4-43a6-a6a1-bce1203db3e0 /var/lib/pgsql/9.3/data ext4
> noatime,nodiratime 0 1
> root(at)rpopdb01e ~ #
>
> According to iostat the disks are not the bottleneck.
>
> 12.02.2014, в 21:30, Ilya Kosmodemiansky <hydrobiont(at)gmail(dot)com>
> написал(а):
>
> Hi Vladimir,
>
> Just in case: how is your ext4 mount?
>
> Best regards,
> Ilya
>
> On Feb 12, 2014, at 17:59, Бородин Владимир <root(at)simply(dot)name> wrote:
>
> Hi all.
>
> Today I have started getting errors like below in logs (seems that I have
> not changed anything for last week). When it happens the db gets lots of
> connections in state active, eats 100% cpu and clients get errors (due to
> timeout).
>
> 2014-02-12 15:44:24.562
> MSK,"rpop","rpopdb_p6",30061,"localhost:58350",52fb5e53.756d,1,"SELECT
> waiting",2014-02-12 15:43:15 MSK,143/264877,1002850566,LOG,00000,"process
> 30061 still waiting for ExclusiveLock on extension of relation 26118 of
> database 24590 after 1000.082 ms",,,,,"SQL statement ""insert into
> rpop.rpop_imap_uidls (folder_id, uidl) values (i_folder_id, i_uidl)""
>
> I have read several topics [1, 2, 3, 4] with similar problems but haven't
> find a good solution. Below is some more diagnostics.
>
> I am running PostgreSQL 9.3.2 installed from RPM packages on RHEL 6.4.
> Host is running with the following CPU (32 cores) and memory:
>
> root(at)rpopdb01e ~ # fgrep -m1 'model name' /proc/cpuinfo
> model name : Intel(R) Xeon(R) CPU E5-2660 0 @ 2.20GHz
> root(at)rpopdb01e ~ # free -m
> total used free shared buffers cached
> Mem: 129028 123558 5469 0 135 119504
> -/+ buffers/cache: 3918 125110
> Swap: 16378 0 16378
> root(at)rpopdb01e ~ #
>
> PGDATA lives on RAID6 array of 8 ssd-disks with ext4, iostat and atop say
> the disks are really free. Right now PGDATA takes only 95G.
> The settings changed in postgresql.conf are here [5].
>
> When it happens the last query from here [6] shows that almost all queries
> are waiting for ExclusiveLock, but they do a simple insert.
>
> (extend,26647,26825,,,,,,,) | 5459 | ExclusiveLock | 1 |
> (extend,26647,26825,,,,,,,) | 8053 | ExclusiveLock | 5459,8053
> (extend,26647,26828,,,,,,,) | 5567 | ExclusiveLock | 1 |
> (extend,26647,26828,,,,,,,) | 5490 | ExclusiveLock | 5567,5490
> (extend,24584,25626,,,,,,,) | 5611 | ExclusiveLock | 1 |
> (extend,24584,25626,,,,,,,) | 3963 | ExclusiveLock | 5611,3963
>
> I have several databases running on one host with one postmaster process
> and ExclusiveLock is being waited by many oids. I suppose the only common
> thing for all of them is that they are bigger than others and they almost
> do not get updates and deletes (only inserts and reads). Some more info
> about one of such tables is here [7].
>
> I have tried to look at the source code (src/backend/access/heap/hio.c) to
> understand when the exclusive lock can be taken, but I could only read
> comments :) I have also examined FSM for this tables and their indexes and
> found that for most of them there are free pages but there are, for
> example, such cases:
>
> rpopdb_p0=# select count(*) from pg_freespace('rpop.rpop_uidl') where
> avail != 0;
> count
> --------
> 115953
> (1 row)
>
> rpopdb_p0=# select count(*) from pg_freespace('rpop.pk_rpop_uidl') where
> avail != 0;
> count
> -------
> 0
> (1 row)
>
> rpopdb_p0=# \dS+ rpop.rpop_uidl
> Table "rpop.rpop_uidl"
> Column | Type | Modifiers | Storage | Stats target |
> Description
>
> --------+------------------------+-----------+----------+--------------+-------------
> popid | bigint | not null | plain | |
> uidl | character varying(200) | not null | extended | |
> Indexes:
> "pk_rpop_uidl" PRIMARY KEY, btree (popid, uidl)
> Has OIDs: no
>
> rpopdb_p0=#
>
>
> My questions are:
> 1. Do we consume 100% cpu (in system) trying to get page from FSM? Or does
> it happen during exclusive lock acquiring? How can I dig it?
> 2. How much space do we extend to the relation when we get exclusive lock
> on it?
> 3. Why extended page is not visible for other backends?
> 4. Is there any possibility of situation where backend A got exclusive
> lock on some relation to extend it. Then OS CPU scheduler made a context
> switch to backend B while backend B is waiting for exclusive lock on the
> same relation. And so on for many backends.
> 5. (and the main question) what can I do to get rid of such situations? It
> is a production cluster and I do not have any ideas what to do with this
> situation :( Any help would be really appropriate.
>
> [1]
> http://www.postgresql.org/message-id/8bca3aa10906011613l8ac2423h8153bbd2513dc159@mail.gmail.com
> [2]
> http://pgsql.performance.narkive.com/IrkPbl3f/postgresql-9-2-3-performance-problem-caused-exclusive-locks
> [3] http://www.postgresql.org/message-id/50A2C93E.9070607@dalibo.com
> [4]
> http://www.postgresql.org/message-id/CAL_0b1sypYeOyNkYNV95nNV2d+4jXTug3HkKF6FahfW7Gvgb_Q@mail.gmail.com
> [5] http://pastebin.com/raw.php?i=Bd40Vn6h
> [6] http://wiki.postgresql.org/wiki/Lock_dependency_information
> [7 http://pastebin.com/raw.php?i=eGrtG524]
>
> --
> Vladimir
>
>
>
>
>
>
> --
> Vladimir
>
>
>
>
>
>
> --
> Да пребудет с вами сила...
> http://simply.name
>
>
>
>
>
>

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Бородин Владимир 2014-02-12 19:57:49 Re: Problem with ExclusiveLock on inserts
Previous Message Бородин Владимир 2014-02-12 19:20:02 Re: Problem with ExclusiveLock on inserts