Re: Intermittent hangs with 9.2

From: David Whittaker <dave(at)iradix(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PostgreSQL Performance <pgsql-performance(at)postgresql(dot)org>, Andres Freund <andres(at)2ndquadrant(dot)com>
Subject: Re: Intermittent hangs with 9.2
Date: 2013-09-20 18:11:12
Message-ID: CABXnLXQU9S1Jd-dwpysTkMdtwETkEUGn4a8XiHUnc98AKXDbAw@mail.gmail.com
Lists: pgsql-performance

We haven't seen any issues since we decreased shared_buffers. We also
tuned some of the longer running / more frequently executed queries, so
that may have had an effect as well, but my money would be on the
shared_buffers change. If the issue reappears I'll try to capture a perf
profile again and post back, but if you don't hear from me again you can
assume the problem is solved.

Thank you all again for the help.

-Dave

On Fri, Sep 13, 2013 at 11:05 AM, David Whittaker <dave(at)iradix(dot)com> wrote:

>
> On Fri, Sep 13, 2013 at 10:52 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>
>> On Thu, Sep 12, 2013 at 3:06 PM, David Whittaker <dave(at)iradix(dot)com> wrote:
>> > Hi All,
>> >
>> > We lowered shared_buffers to 8G and increased effective_cache_size
>> > accordingly. So far, we haven't seen any issues since the adjustment. The
>> > issues have come and gone in the past, so I'm not convinced it won't crop up
>> > again, but I think the best course is to wait a week or so and see how
>> > things work out before we make any other changes.
>> >
>> > Thank you all for your help, and if the problem does reoccur, we'll look
>> > into the other options suggested, like using a patched postmaster and
>> > compiling for perf -g.
>> >
>> > Thanks again, I really appreciate the feedback from everyone.
>>
>> Interesting -- please respond with a follow up if/when you feel
>> satisfied the problem has gone away. Andres was right; I initially
>> mis-diagnosed the problem (there is another issue I'm chasing that has
>> a similar performance presentation but originates from a different
>> area of the code).
>>
>> That said, if reducing shared_buffers made *your* problem go away as
>> well, then this is more evidence that we have an underlying contention
>> mechanic that is somehow influenced by the setting. Speaking frankly,
>> under certain workloads we seem to have contention issues in the
>> general area of the buffer system. I'm thinking (guessing) that the
>> problem is that usage_count is getting incremented faster than the buffers
>> are getting cleared out which is then causing the sweeper to spend
>> more and more time examining hotly contended buffers. This may make
>> no sense in the context of your issue; I haven't looked at the code
>> yet. Also, I've been unable to cause this to happen in simulated
>> testing. But I'm suspicious (and dollars to doughnuts '0x347ba9' is
>> spinlock related).
>>
>> Anyways, thanks for the report and (hopefully) the follow up.
>>
>> merlin
>>
>
> You guys have taken the time to help me through this; following up is the
> least I can do. So far we're still looking good.
>
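
Merlin's guess in the quoted text, that usage_count can be incremented faster than the sweep clears it, can be illustrated with a toy model. The following is a minimal Python sketch, not PostgreSQL code: the names Buffer, touch, clock_sweep and sweep_cost are hypothetical, the pool is tiny, and pinning and locking are left out entirely. The only detail taken from PostgreSQL is that usage_count is capped at 5.

```python
from dataclasses import dataclass

# PostgreSQL caps usage_count at 5; everything else here is a toy assumption.
MAX_USAGE_COUNT = 5


@dataclass
class Buffer:
    usage_count: int = 0


def touch(buf):
    """Model a backend access: bump usage_count, saturating at the cap."""
    buf.usage_count = min(buf.usage_count + 1, MAX_USAGE_COUNT)


def clock_sweep(buffers, hand):
    """Advance the clock hand until a buffer with usage_count == 0 is found.

    Every buffer that is passed over has its usage_count decremented.
    Returns (victim_index, new_hand, buffers_examined).
    """
    examined = 0
    while True:
        idx = hand
        hand = (hand + 1) % len(buffers)
        examined += 1
        if buffers[idx].usage_count == 0:
            return idx, hand, examined
        buffers[idx].usage_count -= 1


def sweep_cost(n_buffers, touches_per_buffer):
    """Count how many buffers one sweep examines before finding a victim."""
    pool = [Buffer() for _ in range(n_buffers)]
    for buf in pool:
        for _ in range(touches_per_buffer):
            touch(buf)
    _, _, examined = clock_sweep(pool, hand=0)
    return examined


if __name__ == "__main__":
    for touches in (0, 1, 5):
        print(touches, sweep_cost(8, touches))
```

In this model a sweep over a uniformly hot pool has to make several full passes before any buffer drops to zero, so the work per eviction grows with both the pool size and how hot the buffers are. Whether that is what actually happened on the affected system is not established in this thread.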
