From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Christophe Pettus <xof(at)thebuild(dot)com>, PostgreSQL-development Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: "stuck spinlock" |
Date: | 2013-12-13 02:41:41 |
Message-ID: | 20131213024141.GF29402@awork2.anarazel.de |
Lists: | pgsql-hackers |
On 2013-12-12 21:15:29 -0500, Tom Lane wrote:
> Christophe Pettus <xof(at)thebuild(dot)com> writes:
> > On Dec 12, 2013, at 5:45 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >> Presumably, we are seeing the victim rather than the perpetrator of
> >> whatever is going wrong.
>
> > This is probing about a bit blindly, but the only thing I can see about this system that is in some way unique (and this is happening on multiple machines, so it's unlikely to be hardware) is that there are a relatively large number of relations (like, 440,000+) distributed over many schemas. Is there anything that pins a buffer that is O(N) to the number of relations?
>
> It's not a buffer *pin* that's at issue, it's a buffer header spinlock.
> And there are no loops, of any sort, that are executed while holding
> such a spinlock. At least not in the core PG code. Are you possibly
> using any nonstandard extensions?
It could maybe be explained by a buffer aborting while performing
IO. Until it has called AbortBufferIO(), other backends will happily loop
in WaitIO(), constantly taking the buffer header spinlock and
locking io_in_progress_lock in shared mode, thereby preventing
AbortBufferIO() from succeeding.
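To make the suspected pattern concrete, here is a rough single-threaded toy model of one waiter. It is only a sketch of the hypothesis above, not the bufmgr.c code: ToyBuffer, toy_wait_io() and the abort_after_polls knob are made-up names, and the premise that the I/O lock is no longer held exclusively while the in-progress flag is still set is an assumption of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a buffer header; hypothetical, NOT PostgreSQL's BufferDesc. */
typedef struct ToyBuffer
{
    bool io_in_progress;        /* models the "I/O in progress" flag */
    bool io_lock_held_excl;     /* models io_in_progress_lock held exclusively */
    long hdr_spinlock_acquires; /* how often the header spinlock was taken */
    long shared_lock_acquires;  /* how often the I/O lock was taken in shared mode */
} ToyBuffer;

/*
 * Models a single waiter polling the buffer.
 *
 * abort_after_polls is a hypothetical knob: the number of polls that pass
 * before the aborting backend finally gets to run its cleanup and clear
 * the flag. Returns the number of polls performed.
 */
static long
toy_wait_io(ToyBuffer *buf, long abort_after_polls)
{
    long polls = 0;

    assert(buf != NULL);

    for (;;)
    {
        bool in_progress;

        /* "Take" the header spinlock, read the flag, "release" it. */
        buf->hdr_spinlock_acquires++;
        in_progress = buf->io_in_progress;

        if (!in_progress)
            break;              /* I/O finished or was cleaned up */

        /* A real waiter would sleep here; the toy simply stops. */
        if (buf->io_lock_held_excl)
            break;

        /*
         * Assumed window: nobody holds the lock exclusively, so the
         * shared acquire/release returns at once and the waiter goes
         * straight back to the header spinlock.
         */
        buf->shared_lock_acquires++;

        polls++;
        if (polls >= abort_after_polls)
            buf->io_in_progress = false;    /* cleanup finally ran */
    }

    return polls;
}
```

In this model every poll costs one header spinlock acquisition plus one shared lock acquisition, and nothing ever puts the waiter to sleep, so the traffic on the spinlock grows with the delay before cleanup and with the number of waiters.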
Christophe: are there any "unusual" ERROR messages preceding the crash,
possibly some minutes before?
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services