From: | Christopher Opena <counterveil(at)gmail(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: Client-based EOFs triggering hung queries? |
Date: | 2011-05-16 21:01:42 |
Message-ID: | BANLkTim0ONH5ZNfnYV70CZxGvm0g0PEDNg@mail.gmail.com |
Lists: | pgsql-general |
Thanks for the reply, Tom. I admit we were a bit rushed during the
troubleshooting process; now that we know precisely how to identify these
procs and deal with them, I imagine we'll grab more info next time before
killing them, including a netstat view and a stack trace per your
recommendation.
Cheers,
-Chris.
On Mon, May 16, 2011 at 1:04 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Christopher Opena <counterveil(at)gmail(dot)com> writes:
> > Hello all,
> > First time poster here - probably a good sign since I've been running
> > postgresql with zero issues for the last several years! At any rate, I've
> > recently run into a strange issue. Client-based EOFs are nothing new to our
> > application; people can sometimes close a connection for a number of
> > reasons. However, on the DB side these have always been released with no
> > issue.
>
> > Today, we had 4 connections that saw either a client-based EOF or a
> > 'canceling statement due to user request', and in all of these 4 cases the
> > query remained active and started chewing heavily into the cpu such that
> > they ground the db server to a halt until each of the procpids were manually
> > issued a 'pg_cancel_backend ( procpid );' from the console. None of the 4
> > queries were particularly heavy; they registered approximately a 2300ms
> > completion time using a query EXPLAIN ANALYZE. A little long, but not too
> > far out of the ordinary on our usual database usage.
>
> It's always been the case that if the client drops the connection
> mid-query, the backend will notice that only when it next tries to send
> some data --- and even then, if the kernel isn't aware that the far end
> has dropped the connection, the kernel may accept and buffer quite a bit
> of data before blocking the backend from doing more work. So if the
> query requires a lot of processing before it starts to emit data, the
> backend can do a lot of work before noticing anything is wrong.
>
> I suspect that these queries were expensive and you made a mistake in
> what you tested after the fact (or maybe conditions changed, eg
> autovacuum came along and updated statistics). It's also possible that
> some network glitch affected the connections such that the server's
> kernel didn't know they were lost. It's hard to say much more than that
> without a lot more evidence than you've provided. If it happens again,
> you might try poking a bit harder into the situation before you kill the
> queries --- netstat's opinion of the connection statuses might be
> interesting, and so might stack traces from the busy backends.
>
> regards, tom lane
>
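
For anyone reading this thread later, here is a minimal Python sketch of the behaviour Tom describes: a worker that computes first and only writes to the socket at the end does not learn that the client has gone until that write. This is a toy model, not PostgreSQL code. The function name run_query and the work_units parameter are made up for illustration, and it uses a local socket pair, where the kernel knows at once that the peer is closed. Over a real TCP connection the first send may even succeed and be buffered, which is the second half of Tom's point.

```python
import socket


def run_query(conn: socket.socket, work_units: int) -> tuple[int, bool]:
    """Toy 'backend' (hypothetical): compute first, send the result at the end."""
    done = 0
    total = 0
    for i in range(work_units):
        # Pure computation: no socket I/O happens here, so nothing in this
        # loop can reveal that the client has already disconnected.
        total += i * i
        done += 1
    try:
        # First attempt to talk to the client; only now can the loss show up.
        conn.sendall(str(total).encode())
        noticed = False
    except OSError:
        # BrokenPipeError / ConnectionResetError are both OSError subclasses.
        noticed = True
    return done, noticed


# Assumption: a Unix-like system, where socketpair() gives a connected local pair.
server_side, client_side = socket.socketpair()
client_side.close()  # the client goes away before the "query" even starts
done, noticed = run_query(server_side, 100_000)
server_side.close()
print(done, noticed)
```

In this sketch every unit of work is completed even though the client was gone from the start, and the disconnect is only reported by the final send.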