From: | Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> |
---|---|
To: | Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> |
Cc: | Simon Riggs <simon(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database) |
Date: | 2012-11-16 13:17:34 |
Message-ID: | 20121116131733.GB4454@alvh.no-ip.org |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Kohei KaiGai escribió:
> 2012/10/22 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:
> > Here's an updated version of this patch, which also works in
> > an EXEC_BACKEND environment. (I haven't tested this at all on Windows,
> > but I don't see anything that would create a portability problem there.)
> >
> I also tried to check the latest patch "briefly".
> Let me comment on several points randomly.
Thanks.
> Once bgworker process got crashed, postmaster tries to restart it about 60
> seconds later. My preference is this interval being configurable with initial
> registration parameter; that includes "never restart again" option.
> Probably, some extensions don't want to restart again on unexpected crash.
The main issue with specifying a crash-restart policy, I thought, was
that after a postmaster restart cycle there is no reasonable way to know
whether any bgworker should restart or not: had any particular bgworker
crashed? Remember that any backend crash, and some bgworker crashes,
will cause the postmaster to reinitialize the whole system. So I don't
see that it will be handled consistently; if we offer the option, module
developers might be led into thinking that a daemon that stops is never
to run again, but that's not going to be the case.
But I'm not really wedded to this behavior.
> Stop is one other significant event for bgworker, not only start-up.
> This interface has no way to specify when we want to stop bgworker.
> Probably, we can offer two options. Currently, bgworkers are simultaneously
> terminated with other regular backends. In addition, I want an option to make
> sure bgworker being terminated after all the regular backends exit.
> It is critical issue for me, because I try to implement parallel
> calculation server
> with bgworker, thus, it should not be terminated earlier than regular backend.
I can do that. I considered stop time as well, but couldn't think of
any case in which I would want the worker to persist beyond backends
exit, so I ended up not adding the option. But I think it's reasonably
easy to add.
> Regarding to process restart, this interface allows to specify the timing to
> start bgworker using BgWorkerStart_* label. On the other hand,
> bgworker_should_start_now() checks correspondence between pmState
> and the configured start-time using equal matching. It can make such a
> scenarios, for instance, a bgworker launched at PM_INIT state, then it got
> crashed. Author expected it shall be restarted, however, pmState was
> already progressed to PM_RUN, so nobody can restart this bgworker again.
> How about an idea to provide the start time with bitmap? It allows extensions
> to specify multiple candidates to launch bgworker.
No, I think you must be misreading the code. If a daemon specified
BgWorkerStart_PostmasterStart then it will start whenever pmState is
PM_INIT *or any later state*. This is why the switch has all those
"fall throughs".
> do_start_bgworker() initializes process state according to normal manner,
> then it invokes worker->bgw_main(). Once ERROR is raised inside of the
> main routine, the control is backed to the sigsetjmp() on do_start_bgworker(),
> then the bgworker is terminated.
> I'm not sure whether it is suitable for bgworker that tries to connect database,
> because it may raise an error everywhere, such as zero division and so on...
> I think it is helpful to provide a utility function in the core; that
> handles to abort
> transactions in progress, clean-up resources, and so on.
Yeah, I considered this too --- basically this is the question I posted
elsewhere, "what else do we need to provide for modules"? A sigsetjmp()
standard block or something is probably part of that.
> It ensures bgworker must be registered within shared_preload_libraries.
> Why not raise an error if someone try to register bgworker later?
Hm, sure, we can do that I guess.
> How about to move bgw_name field into tail of BackgroundWorker structure?
> It makes simplifies the logic in RegisterBackgroundWorker(), if it is
> located as:
> typedef struct BackgroundWorker
> {
> int bgw_flags;
> BgWorkerStartTime bgw_start_time;
> bgworker_main_type bgw_main;
> void *bgw_main_arg;
> bgworker_sighdlr_type bgw_sighup;
> bgworker_sighdlr_type bgw_sigterm;
> char bgw_name[1]; <== (*)
> } BackgroundWorker;
Makes sense.
> StartOneBackgroundWorker always scan the BackgroundWorkerList from
> the head. Isn't it available to save the current position at static variable?
> If someone tries to manage thousand of bgworkers, it makes a busy loop. :(
We could try that.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Markus Wanner | 2012-11-16 13:46:39 | Re: logical changeset generation v3 - comparison to Postgres-R change set format |
Previous Message | Dimitri Fontaine | 2012-11-16 13:13:43 | Re: Materialized views WIP patch |