Re: BUG #13643: Should a process dying bring postgresql down, or not?

From: Amir Rohan <amir(dot)rohan(at)mail(dot)com>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13643: Should a process dying bring postgresql down, or not?
Date: 2015-09-28 21:42:00
Message-ID: 5609B428.6020006@mail.com
Lists: pgsql-bugs

On 09/28/2015 12:06 AM, Alvaro Herrera wrote:
> Amir Rohan wrote:
>> On 09/27/2015 09:59 PM, Alvaro Herrera wrote:
>>> amir(dot)rohan(at)mail(dot)com wrote:
>>>
>>>> postgres 2181 0.0 0.1 134468 9504 pts/0 T 03:34 0:00 /usr/local/pgsql/bin/postgres -D /home/local/pg/s1
>>>> postgres 2183 0.0 0.0 134576 4168 ? Ss 03:34 0:00 postgres: checkpointer process
>>>> postgres 2184 0.0 0.0 134604 2844 ? Ss 03:34 0:00 postgres: writer process
>>>> postgres 2185 0.0 0.0 134468 2780 ? Ss 03:34 0:00 postgres: wal writer process
>>>> postgres 2186 0.0 0.0 0 0 ? Zs 03:34 0:00 [postgres] <defunct> <<<<<<<<<<<<<<< dead process
>>>> postgres 2187 0.0 0.0 127300 2204 ? Ss 03:34 0:00 postgres: stats collector process
>>>> postgres 2193 0.0 0.0 118164 2696 pts/0 T 03:34 0:00 pg_basebackup -D /home/local/pg/backup -p 57833 --format=t -x
>>>> postgres 2194 0.0 0.0 134916 6016 ? Ss 03:34 0:00 postgres: wal sender process user1 [local] sending backup "pg_basebackup base backup"
>>>
>>> That postmaster is in STOPped mode is the issue here. That doesn't
>>> happen unless you take specific action to do that.
>>
>> I hadn't noticed that. That looks like I suspended pg_ctl during start,
>> but with the backup in progress already, it's not clear how I managed
>> that state. There was no kill -SIGSTOP involved...
>
> Suspending a process *is* sending SIGSTOP. You may not have sent
> SIGSTOP explicitly, but the shell would have done it if you suspended
> the process.
>
> Since pg_ctl is not normally long-lived, I'm not sure how you ended up
> suspending it.
>
>> After killing some subprocesses at random I do see postgres
>> restarting the whole group once one goes down, if/once it's
>> running/unsuspended.
>
> Well, doing things randomly is unlikely to teach you much ...
>

Pardon my earlier HTML response, I had to use the webmail interface at
the time. Sending again as text.

>
>
> Sent: Monday, September 28, 2015 at 12:06 AM
> From: "Alvaro Herrera" <alvherre(at)2ndquadrant(dot)com>
> To: "Amir Rohan" <amir(dot)rohan(at)mail(dot)com>
> Cc: pgsql-bugs(at)postgresql(dot)org
> Subject: Re: BUG #13643: Should a process dying bring postgresql down, or not?

> Amir Rohan wrote:
>> On 09/27/2015 09:59 PM, Alvaro Herrera wrote:
>> > amir(dot)rohan(at)mail(dot)com wrote:
>> >
>> >> postgres 2181 0.0 0.1 134468 9504 pts/0 T 03:34 0:00 /usr/local/pgsql/bin/postgres -D /home/local/pg/s1
>> >> postgres 2183 0.0 0.0 134576 4168 ? Ss 03:34 0:00 postgres: checkpointer process
>> >> postgres 2184 0.0 0.0 134604 2844 ? Ss 03:34 0:00 postgres: writer process
>> >> postgres 2185 0.0 0.0 134468 2780 ? Ss 03:34 0:00 postgres: wal writer process
>> >> postgres 2186 0.0 0.0 0 0 ? Zs 03:34 0:00 [postgres] <defunct> <<<<<<<<<<<<<<< dead process
>> >> postgres 2187 0.0 0.0 127300 2204 ? Ss 03:34 0:00 postgres: stats collector process
>> >> postgres 2193 0.0 0.0 118164 2696 pts/0 T 03:34 0:00 pg_basebackup -D /home/local/pg/backup -p 57833 --format=t -x
>> >> postgres 2194 0.0 0.0 134916 6016 ? Ss 03:34 0:00 postgres: wal sender process user1 [local] sending backup "pg_basebackup base backup"
>> >
>> > That postmaster is in STOPped mode is the issue here. That doesn't
>> > happen unless you take specific action to do that.
>>
>> I hadn't noticed that. That looks like I suspended pg_ctl during start,
>> but with the backup in progress already, it's not clear how I managed
>> that state. There was no kill -SIGSTOP involved...
>
> Suspending a process *is* sending SIGSTOP. You may not have sent
> SIGSTOP explicitly, but the shell would have done it if you suspended
> the process.
>

I *know*. But as you can see that backup process is already underway.
That means pg_ctl had returned by then, and I had issued the
pg_basebackup command. Since I didn't manually send a SIGSTOP,
and postgres was already detached by then, I don't know how it
could have gotten suspended.
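
For anyone trying to reproduce the "T" state shown in the ps listing, here is a
minimal sketch, not taken from the original report. It stops a throwaway child
process with SIGSTOP and reads back its state. The helper names `proc_state` and
`stop_and_inspect` are hypothetical, the child is only a stand-in for a
long-lived process such as the postmaster, and the `/proc` lookup assumes
Linux. Note that suspending a foreground job from a terminal typically delivers
SIGTSTP rather than SIGSTOP; ps reports both as "T".

```python
import os
import signal
import subprocess
import sys


def proc_state(pid):
    """Return the one-letter state from /proc/<pid>/status (Linux only), else None."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                if line.startswith("State:"):
                    # Line looks like "State:\tT (stopped)".
                    return line.split()[1]
    except OSError:
        return None
    return None


def stop_and_inspect():
    # Hypothetical stand-in for a long-lived server process.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        os.kill(child.pid, signal.SIGSTOP)
        # WUNTRACED makes waitpid return as soon as the child has stopped.
        _, status = os.waitpid(child.pid, os.WUNTRACED)
        stopped = os.WIFSTOPPED(status)
        stop_signal = os.WSTOPSIG(status) if stopped else None
        state = proc_state(child.pid)
        # Resume the child, mirroring "once it's running/unsuspended".
        os.kill(child.pid, signal.SIGCONT)
        return stopped, stop_signal, state
    finally:
        child.kill()
        child.wait()


if __name__ == "__main__":
    print(stop_and_inspect())
```

While a parent is stopped like this it cannot reap exited children, which is
consistent with the <defunct> entry sitting under a postmaster in state "T".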

> Since pg_ctl is not normally long-lived, I'm not sure how you ended up
> suspending it.
>

Exactly.

>> After killing some subprocesses at random I do see postgres
>> restarting the whole group once one goes down, if/once it's
>> running/unsuspended.

>
> Well, doing things randomly is unlikely to teach you much ...
>

Well, it can teach you which electric socket will
electrocute you when poked with a fork. That's useful data.

Amir
