Re: Overflow of bgwriter's request queue

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Overflow of bgwriter's request queue
Date: 2006-01-11 18:15:54
Message-ID: 20216.1137003354@sss.pgh.pa.us
Lists: pgsql-hackers

ITAGAKI Takahiro <itagaki(dot)takahiro(at)lab(dot)ntt(dot)co(dot)jp> writes:
> I encountered an overflow of the bgwriter's file-fsync request queue. It
> occurred during checkpoints. In such cases each backend calls fsync itself,
> in no particular order, so the checkpoint takes a long time and performance
> drops. It seems to happen frequently on machines with a lot of memory and
> poor disks.

I can't help thinking that this is a situation that could only be got
into with a seriously misconfigured database --- per the comments for
ForwardFsyncRequest, we really don't want this code to run at all,
let alone run so often that a queue with NBuffers entries overflows.
What exactly are the test conditions under which you're seeing this
happen?

If there actually is a problem that needs to be solved, I think it'd be
better to try to do AbsorbFsyncRequests somewhere in the main checkpoint
loops. I don't like the idea of holding the BgWriterCommLock long
enough to do a qsort ... especially not if this occurs only with very
large NBuffers settings. Also, what if the qsort fails to eliminate any
duplicates, or eliminates only a few? You could get into a scenario
where the qsort gets repeated every few ForwardFsyncRequest calls, in
which case it'd become a drag on performance itself. (See also recent
discussion with Qingqing about converting BgWriterCommLock to a
spinlock. Though I was against that because no performance problem had
been shown, it could still become something we want to do ... but
putting a qsort here would foreclose that option.)
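
To make the AbsorbFsyncRequests suggestion concrete, here is a minimal
standalone sketch in C. It is not the PostgreSQL source: the names
forward_request, absorb_requests and simulate_checkpoint, the tiny
QUEUE_CAPACITY (standing in for NBuffers) and the file count are all
hypothetical, and locking is left out entirely. It only models a bounded
queue that the checkpoint loop drains every few writes, so that backends
rarely find it full.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4   /* stands in for NBuffers; tiny for illustration */
#define NFILES 8           /* hypothetical number of distinct files */

typedef struct {
    int    file_ids[QUEUE_CAPACITY];
    size_t count;
} RequestQueue;

/* Backend side: enqueue an fsync request. Returns false when the queue is
 * full, in which case the caller would have to fsync the file itself. */
static bool forward_request(RequestQueue *q, int file_id)
{
    if (q->count >= QUEUE_CAPACITY)
        return false;
    q->file_ids[q->count++] = file_id;
    return true;
}

/* Bgwriter side: move all queued requests into a private pending table and
 * empty the queue. Returns the number of requests absorbed. */
static size_t absorb_requests(RequestQueue *q, bool pending[NFILES])
{
    size_t absorbed = q->count;

    assert(q->count <= QUEUE_CAPACITY);
    for (size_t i = 0; i < q->count; i++)
        pending[q->file_ids[i] % NFILES] = true;
    q->count = 0;
    return absorbed;
}

/* Model a checkpoint in which every buffer write produces one request.
 * If absorb_interval is nonzero, the loop absorbs after every
 * absorb_interval writes; otherwise it absorbs only at the end.
 * Returns how many requests found the queue full. */
static size_t simulate_checkpoint(size_t nwrites, size_t absorb_interval)
{
    RequestQueue q = { .count = 0 };
    bool   pending[NFILES] = { false };
    size_t overflows = 0;

    for (size_t i = 0; i < nwrites; i++) {
        if (!forward_request(&q, (int) (i % NFILES)))
            overflows++;
        if (absorb_interval != 0 && (i + 1) % absorb_interval == 0)
            absorb_requests(&q, pending);
    }
    absorb_requests(&q, pending);   /* final drain before the fsync pass */
    return overflows;
}
```

In this toy model, 16 writes with no absorption inside the loop overflow
12 times, while absorbing every 4 writes never overflows. The queue is
never sorted or compacted, so the time spent on the producer side does not
depend on the queue size.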

regards, tom lane
