Re: issue with gininsert under very high load

From:	Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To:	Andres Freund <andres(at)2ndquadrant(dot)com>
Cc:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: issue with gininsert under very high load
Date:	2014-02-12 21:04:55
Message-ID:	52FBE1F7.6050204@vmware.com
Lists:	pgsql-hackers

On 02/12/2014 10:50 PM, Andres Freund wrote:
> On February 12, 2014 9:33:38 PM CET, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>>> On 2014-02-12 14:39:37 -0500, Andrew Dunstan wrote:
>>>> On investigation I found that a number of processes were locked waiting for
>>>> one wedged process to end its transaction, which never happened (this
>>>> transaction should normally take milliseconds). oprofile revealed that
>>>> postgres was spending 87% of its time in s_lock(), and strace on the wedged
>>>> process revealed that it was in a tight loop constantly calling select(). It
>>>> did not respond to a SIGTERM.
>>
>>> That's a deficiency of the gin fastupdate cache: a) it bases its size
>>> on work_mem which usually makes it *far* too big b) it doesn't perform the
>>> cleanup in one go if it can get a suitable lock, but does independent
>>> locking for each entry. That usually leads to absolutely horrific
>>> performance under concurrency.
>>
>> I'm not sure that what Andrew is describing can fairly be called a
>> concurrent-performance problem. It sounds closer to a stuck lock.
>> Are you sure you've diagnosed it correctly?
>
> No. But I've several times seen similar backtraces where it wasn't actually stuck, just livelocked. I'm just on my mobile right now, but afair Andrew described a loop involving lots of semaphores and spinlocks, which shouldn't be the case if it were actually stuck.
> If there are dozens of processes waiting on the same lock, cleaning up a large amount of items one by one, it's not surprising if it's dramatically slow.

Perhaps we should use a lock to enforce that only one process tries to
clean up the pending list at a time.

- Heikki
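
The suggestion above (a lock so that only one process cleans up the pending list at a time) can be illustrated with a toy model. This is a minimal sketch in Python threads, not PostgreSQL code: the `PendingList` class, its `cleanup_threshold` parameter, and the choice to skip cleanup rather than wait when another worker already holds the lock are all assumptions made for illustration.

```python
import threading


class PendingList:
    """Toy stand-in for a pending list (hypothetical, not PostgreSQL code)."""

    def __init__(self, cleanup_threshold):
        self.cleanup_threshold = cleanup_threshold
        self._items = []
        self._items_lock = threading.Lock()    # protects _items
        self._cleanup_lock = threading.Lock()  # held by the single cleaner
        self.merged = []                       # stand-in for the main index
        self.cleanup_runs = 0

    def insert(self, item):
        # Appending is cheap; only the length check decides on cleanup.
        with self._items_lock:
            self._items.append(item)
            needs_cleanup = len(self._items) >= self.cleanup_threshold
        if needs_cleanup:
            self.try_cleanup()

    def try_cleanup(self):
        # Non-blocking acquire: if another thread is already cleaning,
        # return immediately instead of piling up behind it.
        if not self._cleanup_lock.acquire(blocking=False):
            return False
        try:
            # Detach the whole batch in one step, then merge it while
            # holding only the cleanup lock.
            with self._items_lock:
                batch, self._items = self._items, []
            self.merged.extend(batch)
            self.cleanup_runs += 1
            return True
        finally:
            self._cleanup_lock.release()
```

In this sketch an inserter that finds the cleanup lock taken simply returns, so at most one thread works on the list while the rest continue inserting. Whether waiters should skip, block, or retry later is a design choice the message above leaves open.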

In response to

Re: issue with gininsert under very high load at 2014-02-12 20:50:22 from Andres Freund

Responses

Re: issue with gininsert under very high load at 2014-02-13 15:40:18 from Andrew Dunstan
