From: | "Pweaver (Paul Weaver)" <pweaver(at)panjiva(dot)com> |
---|---|
To: | Jeff Janes <jeff(dot)janes(at)gmail(dot)com> |
Cc: | "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>, Michael Vezza <michael(at)panjiva(dot)com> |
Subject: | Re: Postgres Query Plan Live Lock |
Date: | 2014-02-05 19:47:53 |
Message-ID: | CAFTGa=kU2aTRQVdW2Wwg4XUYxzU0_7LCWEYfhGBAm3n5+7Yyyw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Wed, Feb 5, 2014 at 9:52 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> On Monday, February 3, 2014, Pweaver (Paul Weaver) <pweaver(at)panjiva(dot)com>
> wrote:
>
>> We have been running into a (live lock?) issue on our production Postgres
>> instance causing queries referencing a particular table to become extremely
>> slow and our application to lock up.
>>
>> This tends to occur on a particular table that gets a lot of queries
>> against it after a large number of deletes. When this happens, the
>> following symptoms occur when queries referencing that table are run (even
>> it we stop the deleting):
>>
>
> What do you mean by "stop the deleting"? Are you pausing the delete but
> without either committing or rolling back the transaction, but just holding
> it open? Are you stopping it cleanly, between transactions?
>
We are repeatedly running delete commands in their own transactions. We
stop issuing new deletes and let them finish cleanly.
>
> Also, how many queries are happening concurrently? Perhaps you need a
> connection pooler.
>
Usually between 1 and 20. When it gets locked up closer to 100-200.
We should add a connection pooler. Would the number of active queries on
the table be causing the issue?
>
> Is the CPU time user time or system time? What kernel version do you have?
>
Real time - 3.2.0-26
>
>
>> SELECT * FROM table_name LIMIT 10; -- takes ~45 seconds to complete
>> EXPLAIN SELECT * FROM table_name LIMIT 10; -- takes ~45 seconds to
>> complete the explain query, the query plan looks reasonable
>>
>
> This sounds like the problem we heard quite a bit about recently, where
> processes spend a lot of time fighting over the proclock while they try to
> check the commit status of open transactions while. But I don't see how
> deletes could trigger that behavior. If the delete has not committed, the
> tuples are still visible and the LIMIT 10 is quickly satisfied. If the
> delete has committed, the tuples quickly get hinted, and so the next query
> along should be faster.
>
> I also don't see why the explain would be slow. A similar problem was
> tracked down to digging through in-doubt tuples while trying to use an
> index to find the true the min or max during estimating the cost of a merge
> join. But I don't think a simple table query should lead to that, unless
> table_name is a view. And I don't see how deletes, rather than uncommitted
> inserts, could trigger it either.
>
>
> max_connections | 600 |
>> configuration file
>>
>
> That is quite extreme. If a temporary load spike (like from the deletes
> and the hinting needed after them) slows down the select queries and you
> start more and more of them, soon you could tip the system over into kernel
> scheduler insanity with high system time. Once in this mode, it will stay
> there until the incoming stream of queries stops and the existing ones
> clear out. But, if that is what is occurring, I don't know why queries on
> other tables would still be fast.
>
We probably want a connection pooler anyways, but in this particular case,
the load average is fairly low on the machine running Postrgres.
>
> Cheers,
>
> Jeff
>
>>
--
Thank You,
Pweaver (pweaver(at)panjiva(dot)com)
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2014-02-05 21:19:20 | Re: [PERFORM] encouraging index-only scans |
Previous Message | Pweaver (Paul Weaver) | 2014-02-05 19:36:47 | Re: Postgres Query Plan Live Lock |