Re: killing pg_dump leaves backend process

From: Greg Stark <stark(at)mit(dot)edu>
To: Christopher Browne <cbbrowne(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Tatsuo Ishii <ishii(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: killing pg_dump leaves backend process
Date: 2013-08-11 02:24:35
Message-ID: CAM-w4HP5=EEHtM6-FgV2=mCohYK6RJzgsvSAg7GM20KuNNKwFw@mail.gmail.com
Lists: pgsql-hackers

I think this is utterly the wrong way to think about this.

TCP is designed to be robust against transient network outages. They are
*not* supposed to cause disconnections. The purpose of keepalives is to
detect connections that still look like valid live connections but are in
fact stale, with the remote end no longer present.

Keepalives that trigger on a timescale of less than several times the MSL
(maximum segment lifetime) are just broken and make TCP unreliable. That
means they cannot trigger in less than many minutes.

This case is one that should just work, and it should work immediately. From
the user's point of view, when a client dies cleanly the kernel on the
client is fully aware of the connection being closed and the network is
working fine. The server should be aware the client has gone away
*immediately*. There's no excuse for any polling or timeouts.

--
greg
On 10 Aug 2013 17:30, "Christopher Browne" <cbbrowne(at)gmail(dot)com> wrote:

> On Sat, Aug 10, 2013 at 12:30 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > Tatsuo Ishii <ishii(at)postgresql(dot)org> writes:
> >> I noticed pg_dump does not exit gracefully when killed.
> >> start pg_dump
> >> kill pg_dump by ctrl-c
> >> ps x
> >
> >> 27246 ? Ds 96:02 postgres: t-ishii dbt3 [local] COPY
> >> 29920 ? S 0:00 sshd: ishii(at)pts/5
> >> 29921 pts/5 Ss 0:00 -bash
> >> 30172 ? Ss 0:00 postgres: t-ishii dbt3 [local] LOCK TABLE waiting
> >
> >> As you can see, after killing pg_dump, a backend process is (LOCK
> >> TABLE waiting) left behind. I think this could be easily fixed by
> >> adding signal handler to pg_dump so that it catches the signal and
> >> issues a query cancel request.
> >
> > If we think that's a problem (which I'm not convinced of) then pg_dump
> > is the wrong place to fix it. Any other client would behave the same
> > if it were killed while waiting for some backend query. So the right
> > fix would involve figuring out a way for the backend to kill itself
> > if the client connection goes away while it's waiting.
>
> This seems to me to be quite a bit like the TCP keepalive issue.
>
> We noticed with Slony that if something ungraceful happens in the
> networking layer (the specific thing noticed was someone shutting off
> networking, e.g. "/etc/init.d/networking stop" before shutting down
> Postgres+Slony), the usual timeouts are really rather excessive, on
> the order of a couple hours.
>
> Probably it would be desirable to reduce the timeout period, so that
> the server could figure out that clients are incommunicado "reasonably
> quickly." It's conceivable that it would be apropos to diminish the
> timeout values in postgresql.conf, or at least to recommend that users
> consider doing so.
> --
> When confronted by a difficult problem, solve it by reducing it to the
> question, "How would the Lone Ranger handle this?"
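For what it's worth, the server-side knobs Christopher alludes to already
exist; tightening them in postgresql.conf might look like this (values
purely illustrative, not a recommendation; each setting defaults to 0,
meaning "use the operating system's default"):

```
# postgresql.conf -- illustrative keepalive tuning only
tcp_keepalives_idle = 60        # seconds of idle before the first probe
tcp_keepalives_interval = 10    # seconds between unanswered probes
tcp_keepalives_count = 5        # unanswered probes before the connection is dropped
```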
