Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)
Date: 2012-11-16 13:17:34
Message-ID: 20121116131733.GB4454@alvh.no-ip.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Kohei KaiGai escribió:
> 2012/10/22 Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>:
> > Here's an updated version of this patch, which also works in
> > an EXEC_BACKEND environment. (I haven't tested this at all on Windows,
> > but I don't see anything that would create a portability problem there.)
> >
> I also tried to check the latest patch "briefly".
> Let me comment on several points randomly.

Thanks.

> Once bgworker process got crashed, postmaster tries to restart it about 60
> seconds later. My preference is this interval being configurable with initial
> registration parameter; that includes "never restart again" option.
> Probably, some extensions don't want to restart again on unexpected crash.

The main issue with specifying a crash-restart policy, I thought, was
that after a postmaster restart cycle there is no reasonable way to know
whether any bgworker should restart or not: had any particular bgworker
crashed? Remember that any backend crash, and some bgworker crashes,
will cause the postmaster to reinitialize the whole system. So I don't
see that it will be handled consistently; if we offer the option, module
developers might be led into thinking that a daemon that stops is never
to run again, but that's not going to be the case.

But I'm not really wedded to this behavior.

> Stop is one other significant event for bgworker, not only start-up.
> This interface has no way to specify when we want to stop bgworker.
> Probably, we can offer two options. Currently, bgworkers are simultaneously
> terminated with other regular backends. In addition, I want an option to make
> sure bgworker being terminated after all the regular backends exit.
> It is critical issue for me, because I try to implement parallel
> calculation server
> with bgworker, thus, it should not be terminated earlier than regular backend.

I can do that. I considered stop time as well, but couldn't think of
any case in which I would want the worker to persist beyond backends
exit, so I ended up not adding the option. But I think it's reasonably
easy to add.

> Regarding to process restart, this interface allows to specify the timing to
> start bgworker using BgWorkerStart_* label. On the other hand,
> bgworker_should_start_now() checks correspondence between pmState
> and the configured start-time using equal matching. It can make such a
> scenarios, for instance, a bgworker launched at PM_INIT state, then it got
> crashed. Author expected it shall be restarted, however, pmState was
> already progressed to PM_RUN, so nobody can restart this bgworker again.
> How about an idea to provide the start time with bitmap? It allows extensions
> to specify multiple candidates to launch bgworker.

No, I think you must be misreading the code. If a daemon specified
BgWorkerStart_PostmasterStart then it will start whenever pmState is
PM_INIT *or any later state*. This is why the switch has all those
"fall throughs".

> do_start_bgworker() initializes process state according to normal manner,
> then it invokes worker->bgw_main(). Once ERROR is raised inside of the
> main routine, the control is backed to the sigsetjmp() on do_start_bgworker(),
> then the bgworker is terminated.
> I'm not sure whether it is suitable for bgworker that tries to connect database,
> because it may raise an error everywhere, such as zero division and so on...
> I think it is helpful to provide a utility function in the core; that
> handles to abort
> transactions in progress, clean-up resources, and so on.

Yeah, I considered this too --- basically this is the question I posted
elsewhere, "what else do we need to provide for modules"? A sigsetjmp()
standard block or something is probably part of that.

> It ensures bgworker must be registered within shared_preload_libraries.
> Why not raise an error if someone try to register bgworker later?

Hm, sure, we can do that I guess.

> How about to move bgw_name field into tail of BackgroundWorker structure?
> It makes simplifies the logic in RegisterBackgroundWorker(), if it is
> located as:
> typedef struct BackgroundWorker
> {
> int bgw_flags;
> BgWorkerStartTime bgw_start_time;
> bgworker_main_type bgw_main;
> void *bgw_main_arg;
> bgworker_sighdlr_type bgw_sighup;
> bgworker_sighdlr_type bgw_sigterm;
> char bgw_name[1]; <== (*)
> } BackgroundWorker;

Makes sense.

> StartOneBackgroundWorker always scan the BackgroundWorkerList from
> the head. Isn't it available to save the current position at static variable?
> If someone tries to manage thousand of bgworkers, it makes a busy loop. :(

We could try that.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Markus Wanner 2012-11-16 13:46:39 Re: logical changeset generation v3 - comparison to Postgres-R change set format
Previous Message Dimitri Fontaine 2012-11-16 13:13:43 Re: Materialized views WIP patch