From: | Dimitri Fontaine <dimitri(at)2ndQuadrant(dot)fr> |
---|---|
To: | Andres Freund <andres(at)2ndquadrant(dot)com> |
Cc: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: putting a bgworker to rest |
Date: | 2013-04-24 16:30:57 |
Message-ID: | m21u9z7uwu.fsf@2ndQuadrant.fr |
Lists: | pgsql-hackers |
Andres Freund <andres(at)2ndquadrant(dot)com> writes:
>> How would postmaster know when to restart a worker that stopped?
>
> I had imagined we would assign some return codes special
> meaning. Currently 0 basically means "restart immediately", 1 means
> "crashed, wait for some time", everything else results in a postmaster
> restart. It seems we can just assign returncode 2 as "done", probably
> with some enum or such hiding the numbers.
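To make the exit-code idea concrete, here is a minimal sketch in C of an enum hiding the numbers. The values 0, 1 and 2 follow the description quoted above; every identifier (WorkerExitCode, SupervisorAction, action_for_exit_code) is hypothetical and not an existing PostgreSQL API.

```c
/* Hypothetical names throughout; only the numeric meanings come from the thread. */
typedef enum WorkerExitCode
{
    WORKER_EXIT_RESTART_NOW = 0,    /* "restart immediately" */
    WORKER_EXIT_CRASHED = 1,        /* "crashed, wait for some time" */
    WORKER_EXIT_DONE = 2            /* proposed: worker is done, do not restart */
} WorkerExitCode;

typedef enum SupervisorAction
{
    ACTION_RESTART_WORKER_NOW,
    ACTION_RESTART_WORKER_LATER,
    ACTION_FORGET_WORKER,
    ACTION_RESTART_POSTMASTER
} SupervisorAction;

/* Map a worker's exit code to what the supervising process would do. */
static SupervisorAction
action_for_exit_code(int exit_code)
{
    switch (exit_code)
    {
        case WORKER_EXIT_RESTART_NOW:
            return ACTION_RESTART_WORKER_NOW;
        case WORKER_EXIT_CRASHED:
            return ACTION_RESTART_WORKER_LATER;
        case WORKER_EXIT_DONE:
            return ACTION_FORGET_WORKER;
        default:
            /* "everything else results in a postmaster restart" */
            return ACTION_RESTART_POSTMASTER;
    }
}
```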
In Erlang, the library that takes care of such things is called OTP, and
it proposes a supervisor model that knows when to restart a worker. The
specs for the restart behaviour are:
  Restart = permanent | transient | temporary

  Restart defines when a terminated child process should be restarted.

  - A permanent child process is always restarted.
  - A temporary child process is never restarted (not even when the
    supervisor's restart strategy is rest_for_one or one_for_all and a
    sibling's death causes the temporary process to be terminated).
  - A transient child process is restarted only if it terminates
    abnormally, i.e. with another exit reason than normal, shutdown or
    {shutdown,Term}.
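The three restart types above boil down to a small decision function. A minimal sketch in C, assuming a simplified exit reason where shutdown and {shutdown,Term} are folded into one value; all names here (RestartPolicy, ExitReason, should_restart) are hypothetical and exist neither in OTP nor in PostgreSQL.

```c
#include <stdbool.h>

/* Hypothetical C rendering of the OTP restart types. */
typedef enum RestartPolicy
{
    RESTART_PERMANENT,      /* always restarted */
    RESTART_TRANSIENT,      /* restarted only after an abnormal exit */
    RESTART_TEMPORARY       /* never restarted */
} RestartPolicy;

/* Simplified exit reasons; EXIT_SHUTDOWN stands for shutdown and {shutdown,Term}. */
typedef enum ExitReason
{
    EXIT_NORMAL,
    EXIT_SHUTDOWN,
    EXIT_ABNORMAL
} ExitReason;

/* Decide whether a terminated child should be started again. */
static bool
should_restart(RestartPolicy policy, ExitReason reason)
{
    switch (policy)
    {
        case RESTART_PERMANENT:
            return true;
        case RESTART_TEMPORARY:
            return false;
        case RESTART_TRANSIENT:
            return reason == EXIT_ABNORMAL;
    }
    return false;           /* unknown policy: be conservative */
}
```

In this reading, the "done" exit code discussed earlier in the thread would correspond to a transient worker exiting normally.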
As for restart frequency, what they have is:
  The supervisors have a built-in mechanism to limit the number of
  restarts which can occur in a given time interval. This is
  determined by the values of the two parameters MaxR and MaxT in the
  start specification returned by the callback function [ ... ]

  If more than MaxR number of restarts occur in the last MaxT seconds,
  then the supervisor terminates all the child processes and then
  itself.
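The MaxR/MaxT rule can be sketched as a sliding window over restart timestamps. This is only an illustration in C under stated assumptions: the names (RestartLimiter, limiter_init, limiter_record_restart) and the fixed-size buffer are hypothetical, and a restart exactly MaxT seconds old is counted as still inside the window, which the quoted text does not specify.

```c
#include <stdbool.h>
#include <time.h>

#define MAX_TRACKED_RESTARTS 64     /* arbitrary bound for this sketch */

/* Hypothetical bookkeeping for "more than MaxR restarts in the last MaxT seconds". */
typedef struct RestartLimiter
{
    int     max_r;                          /* MaxR: restarts tolerated per window */
    int     max_t;                          /* MaxT: window length in seconds */
    time_t  stamps[MAX_TRACKED_RESTARTS];   /* restart times still in the window */
    int     count;
} RestartLimiter;

static void
limiter_init(RestartLimiter *l, int max_r, int max_t)
{
    /* Clamp so the limit can always be exceeded within the fixed buffer. */
    if (max_r >= MAX_TRACKED_RESTARTS)
        max_r = MAX_TRACKED_RESTARTS - 1;
    l->max_r = max_r;
    l->max_t = max_t;
    l->count = 0;
}

/*
 * Record a restart happening at "now". Returns true while the restart
 * intensity is acceptable, false once more than MaxR restarts fall within
 * the last MaxT seconds (the supervisor should then give up).
 */
static bool
limiter_record_restart(RestartLimiter *l, time_t now)
{
    int kept = 0;

    /* Forget restarts that have fallen out of the window. */
    for (int i = 0; i < l->count; i++)
    {
        if (difftime(now, l->stamps[i]) <= (double) l->max_t)
            l->stamps[kept++] = l->stamps[i];
    }
    l->count = kept;

    if (l->count < MAX_TRACKED_RESTARTS)
        l->stamps[l->count++] = now;

    return l->count <= l->max_r;
}
```

With MaxR = 2 and MaxT = 10, a third restart within ten seconds makes limiter_record_restart return false, while restarts spaced further apart than MaxT are always accepted.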
You can read the whole thing here:
http://www.erlang.org/doc/design_principles/sup_princ.html#id71215
I think we should get some inspiration from them here.
Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support