Re: measuring lwlock-related latency spikes

From:	Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
To:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc:	Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Greg Stark <stark(at)mit(dot)edu>, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: measuring lwlock-related latency spikes
Date:	2012-04-02 21:13:00
Message-ID:	CAMkU=1xa9DBbscMUVyD+2KNAFLAAmTO+1dqQwPzJvei97SA7YQ@mail.gmail.com
Lists:	pgsql-hackers

On Mon, Apr 2, 2012 at 12:04 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> Long story short, when a CLOG-related stall happens,
>> essentially all the time is being spent in this here section of code:
>
>> /*
>> * If not part of Flush, need to fsync now. We assume this happens
>> * infrequently enough that it's not a performance issue.
>> */
>> if (!fdata) // fsync and close the file
>
> Seems like basically what you've proven is that this code path *is* a
> performance issue, and that we need to think a bit harder about how to
> avoid doing the fsync while holding locks.

And why is the fsync needed at all upon merely evicting a dirty page
so a replacement can be loaded?

If the system crashes between the write and the (eventual) fsync, you
are in the same position as if the system crashed while the page was
dirty in shared memory. Either way, you have to be able to recreate
it from WAL, right?
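
To make the question concrete, here is a minimal sketch of the alternative being asked about: write the page at eviction time, remember that the file still owes an fsync, and pay that debt later outside the eviction path. Everything in it (SegmentFile, evict_dirty_page, checkpoint_flush, the PAGE_SIZE constant) is a hypothetical name made up for illustration and is not the actual SLRU code. It also assumes that WAL replay can rebuild any page written but not yet fsynced, which is exactly the point in question.

```c
/* Hypothetical sketch: names are illustrative, not PostgreSQL's SLRU API. */
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

#define PAGE_SIZE 8192          /* assumed page size for the sketch */

/* One segment file whose writes may not yet be durable. */
typedef struct
{
    int  fd;                    /* open file descriptor for the segment */
    bool needs_fsync;           /* true if written since the last fsync */
} SegmentFile;

/*
 * Evict a dirty page: hand it to the kernel with pwrite() but do not fsync.
 * Until the later fsync, a crash may lose the page, just as if it had still
 * been dirty in shared memory, so recovery must be able to rebuild it.
 * Returns 0 on success, -1 on error or short write.
 */
int
evict_dirty_page(SegmentFile *seg, long pageno, const char *page)
{
    off_t   offset = (off_t) pageno * PAGE_SIZE;
    ssize_t written = pwrite(seg->fd, page, PAGE_SIZE, offset);

    if (written != PAGE_SIZE)
        return -1;
    seg->needs_fsync = true;    /* remember the debt; pay it later */
    return 0;
}

/*
 * Deferred flush: the only place in this sketch that waits on fsync.
 * The caller is assumed to invoke it without holding the buffer lock.
 */
int
checkpoint_flush(SegmentFile *seg)
{
    if (!seg->needs_fsync)
        return 0;
    if (fsync(seg->fd) != 0)
        return -1;
    seg->needs_fsync = false;
    return 0;
}
```

In this arrangement the eviction path only pays for a write into the kernel's cache, and the wait for stable storage moves to whoever calls checkpoint_flush().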

Cheers,

Jeff

In response to

Re: measuring lwlock-related latency spikes at 2012-04-02 19:04:14 from Tom Lane
