From: | Greg Smith <greg(at)2ndquadrant(dot)com> |
---|---|
To: | Herouth Maoz <herouth(at)unicell(dot)co(dot)il> |
Cc: | Craig Ringer <craig(at)postnewspapers(dot)com(dot)au>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: stopping processes, preventing connections |
Date: | 2010-03-17 18:16:22 |
Message-ID: | 4BA11C76.8070605@2ndquadrant.com |
Lists: | pgsql-general |
Herouth Maoz wrote:
> Aren't socket writes supposed to have time outs of some sort? Stupid policies notwithstanding, processes on the client side can disappear for any number of reasons - bugs, power failures, whatever - and this is not something that is supposed to cause a backend to hang, I would assume.
>
Note that you're not in the PostgreSQL code at the point where this is
stuck; you're deep in the libc socket code. Guaranteeing well-behaved
sockets at the OS level is not always possible, because TCP/IP
emphasizes robust delivery. See section 2.8 "Why does it take so long to
detect that the peer died?" at
http://www.faqs.org/faqs/unix-faq/socket/ for some background here, and
note that the point you're stuck at is inside the database's keepalive
handling, which is trying to do the right thing here.
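For anyone who hasn't worked with keepalives directly, here is a minimal
sketch of what turning them on looks like at the socket level. It is not
the PostgreSQL code; the function name enable_keepalive and the timing
values are made up for illustration, and the TCP_KEEP* options are only
applied where the platform's socket module defines them. On the server
side, the comparable knobs are the tcp_keepalives_idle,
tcp_keepalives_interval and tcp_keepalives_count settings.

```python
import socket


def enable_keepalive(sock, idle=60, interval=10, count=5):
    """Turn on TCP keepalive for sock and return the options applied.

    The default timings are illustrative assumptions, not recommendations.
    """
    # SO_KEEPALIVE itself is widely available.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    applied = {"SO_KEEPALIVE": 1}

    # The tuning options are platform dependent, so only set the ones
    # this build of Python actually exposes.
    tuning = (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # failed probes before giving up
    )
    for name, value in tuning:
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
            applied[name] = value
    return applied
```

With settings like these, a dead peer on an otherwise idle connection
should be noticed after roughly idle + interval * count seconds, rather
than whatever the OS default happens to be.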
As general commentary on this area, most of the unkillable backends
I've seen, which usually get noticed when the server won't shut down,
have resulted from bad socket behavior. It's a really tricky area to
get right, and it's hard to guarantee the database backends will be
robust against every possible weird OS behavior.
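On the original question about write timeouts: a plain blocking write
has none unless the application asks for one. The sketch below shows
this with a local socket pair whose reading end never reads, which is an
assumption standing in for a vanished client; the function name
send_stalls_without_reader is hypothetical and this is not how the
backend itself is written.

```python
import socket


def send_stalls_without_reader(timeout=0.2, chunk=b"x" * 65536):
    """Return True if writing to a peer that never reads hits the timeout."""
    writer, reader = socket.socketpair()
    try:
        # Without an explicit timeout, send() on a full buffer would
        # simply block until the peer drains it.
        writer.settimeout(timeout)
        try:
            # Push far more data than the kernel buffers can hold.
            for _ in range(10000):
                writer.send(chunk)
        except socket.timeout:
            return True
        return False
    finally:
        writer.close()
        reader.close()
```

Once the kernel buffers fill, the write stalls, and it is only the
timeout set on the writing side that gets control back.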
However, if you can repeatably get the server into this bad state at
will, it may be worth spending some more time digging into this in hopes
there is something valuable to learn about your situation that can
improve the keepalive handling on the server side. Did you mention your
PostgreSQL server version and platform? I didn't see the exact code
path you're stuck in during a quick look at the code involved (using a
snapshot of recent development), which makes me wonder if this isn't
already a resolved problem in a newer version.
--
Greg Smith 2ndQuadrant US Baltimore, MD
PostgreSQL Training, Services and Support
greg(at)2ndQuadrant(dot)com www.2ndQuadrant.us