From: | Magnus Hagander <magnus(at)hagander(dot)net> |
---|---|
To: | Florian Pflug <fgp(at)phlo(dot)org> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Bug in walsender when calling out to do_pg_stop_backup (and others?) |
Date: | 2011-10-06 19:48:55 |
Message-ID: | CABUevEw48PsBYtj+UjP4r+3aQN=aH6RgpMw6Qxpb+Ra02BUn5Q@mail.gmail.com |
Lists: | pgsql-hackers |
On Thu, Oct 6, 2011 at 14:34, Florian Pflug <fgp(at)phlo(dot)org> wrote:
> On Oct5, 2011, at 15:30 , Magnus Hagander wrote:
>> When walsender calls out to do_pg_stop_backup() (during base backups),
>> it is not possible to terminate the process with a SIGTERM - it
>> requires a SIGKILL. This can leave unkillable backends for example if
>> archive_mode is on and archive_command is failing (or not set). A
>> similar thing would happen in other cases if walsender calls out to
>> something that would block (do_pg_start_backup() for example), but the
>> stop one is easy to provoke.
>
> Hm, this seems to be related to another buglet I noticed a while ago,
> but then forgot about again. If one terminates pg_basebackup while it's
> waiting for all required WAL to be archived, the backend process only
> exits once that waiting phase is over. If, like in your failure case,
> archive_command fails indefinitely (or isn't set), the backend process
> stays around forever.
Yes.
> Your patch would improve that only insofar as it'd at least allow an
> immediate shutdown request to succeed - as it stands, that doesn't work
> because, as you mentioned, the blocked walsender doesn't handle SIGTERM.
Exactly.
> The question is, should we do more? To me, it'd make sense to terminate
> a backend once its connection is gone. We could, for example, make
> pq_flush() set a global flag, and make CHECK_FOR_INTERRUPTS handle a
> broken connection the same way as a SIGINT or SIGTERM.
The problem here is that we're hanging at a place where we don't touch
the socket. So we won't notice the socket is gone. We'd have to do a
select() or something like that at regular intervals to make sure it's
there, no?
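
To make the "select() or something like that" idea concrete, here is a rough sketch of such a periodic check, using poll() with a zero timeout plus a non-consuming recv(). Note that client_connection_lost() is a hypothetical helper made up for illustration, not an existing PostgreSQL function, and the sketch assumes a platform that supports MSG_DONTWAIT (Linux and the BSDs do). How it would be wired into the wait loop is left open.

```c
#include <errno.h>
#include <poll.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of PostgreSQL): report whether the peer of
 * a connected stream socket appears to have closed or reset the connection.
 * Nothing is consumed from the socket, so pending client data is untouched.
 */
static bool
client_connection_lost(int sock)
{
    struct pollfd pfd;
    char        c;
    ssize_t     r;
    int         rc;

    pfd.fd = sock;
    pfd.events = POLLIN;
    pfd.revents = 0;

    /* Zero timeout: sample the current state, never block. */
    do
    {
        rc = poll(&pfd, 1, 0);
    } while (rc < 0 && errno == EINTR);

    if (rc < 0)
        return true;            /* unexpected poll() failure: assume lost */
    if (rc == 0)
        return false;           /* nothing readable, no hangup reported */
    if (pfd.revents & POLLNVAL)
        return true;            /* descriptor is not even open */

    /* Readable or hung up: peek one byte to tell EOF apart from real data. */
    do
    {
        r = recv(sock, &c, 1, MSG_PEEK | MSG_DONTWAIT);
    } while (r < 0 && errno == EINTR);

    if (r == 0)
        return true;            /* orderly shutdown by the peer */
    if (r < 0)
        return !(errno == EAGAIN || errno == EWOULDBLOCK);

    return false;               /* data is pending, connection still usable */
}
```

The wait loop would have to call something like this at regular intervals, which is exactly the extra work in question. One limitation: if the client sent data before disconnecting, the peek sees that data first, and the disconnect is only noticed once it has been read.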
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/