Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)

From: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
To: Amit kapila <amit(dot)kapila(at)huawei(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, Boszormenyi Zoltan <zb(at)cybertec(dot)at>, Jaime Casanova <jaime(at)2ndquadrant(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, David E. Wheeler <david(at)justatheory(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Hans-Jürgen Schönig <hs(at)cybertec(dot)at>
Subject: Re: [v9.3] Extra Daemons (Re: elegant and effective way for running jobs inside a database)
Date: 2012-09-23 18:54:12
Message-ID: 1348425468-sup-3678@alvh.no-ip.org
Lists: pgsql-hackers

Excerpts from Amit Kapila's message of Sat Sep 22 01:14:40 -0300 2012:
> On Friday, September 21, 2012 6:50 PM Alvaro Herrera wrote:
> Excerpts from Amit Kapila's message of Fri Sep 21 02:26:49 -0300 2012:
> > On Thursday, September 20, 2012 7:13 PM Alvaro Herrera wrote:
>
> > > > Well, there is a difficulty here which is that the number of processes
> > > > connected to databases must be configured during postmaster start
> > > > (because it determines the size of certain shared memory structs). So
> > > > you cannot just spawn more tasks if all max_worker_tasks are busy.
> > > > (This is a problem only for those workers that want to be connected as
> > > > backends. Those that want libpq connections do not need this and are
> > > > easier to handle.)
> >
>
> > > If not the above, then where is there a need for dynamic worker tasks, as mentioned by Simon?
>
> > Well, I think there are many uses for dynamic workers, or short-lived
> > workers (start, do one thing, stop and not be restarted).
>
> > In my design, a worker is always restarted if it stops; otherwise there
> > is no principled way to know whether it should be running or not (after
> > a crash, should we restart a registered worker? We don't know whether
> > it stopped before the crash.) So it seems to me that at least for this
> > first shot we should consider workers as processes that are going to be
> > always running as long as postmaster is alive. On a crash, if they have
> > a backend connection, they are stopped and then restarted.
>
> a. Is there a chance that a worker would leave shared memory inconsistent after a crash, for example by holding a lock on some structure and crashing before releasing it?
> If so, do we need to reinitialize shared memory as well when the worker is restarted?

Any worker that requires access to shared memory will have to be stopped
and restarted on a crash (of any other postmaster child process).
Conversely, if a worker requires shmem access, it will have to cause the
whole system to be stopped/restarted if it crashes in some ugly way.
Same as any current process that's connected to shared memory, I think.

So, to answer your question, yes. We need to take the safe route and
consider that a crashed process might have corrupted shmem. (But if it
dies cleanly, then there is no need for this.)
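
To make that rule concrete, here is a toy model of the restart decision.
It is only a sketch in Python, not postmaster code: the Child class, the
Exit enum and children_to_restart() are names invented for illustration,
and the branch for a crashing worker that has no shmem access is my
assumption about what the rule implies.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Exit(Enum):
    CLEAN = auto()  # the worker exited on its own, in an orderly way
    CRASH = auto()  # the worker died "in some ugly way"


@dataclass
class Child:
    name: str          # hypothetical label for a supervised process
    uses_shmem: bool   # True if the process is attached to shared memory


def children_to_restart(children, dead, how):
    """Return the names of the children a supervisor must (re)start.

    Toy policy, assuming the rule described above:
    - a crash of a shmem-attached child may have corrupted shared memory,
      so every shmem-attached child is stopped and restarted;
    - otherwise only the child that stopped is restarted.
    """
    if how is Exit.CRASH and dead.uses_shmem:
        return sorted(c.name for c in children if c.uses_shmem)
    return [dead.name]
```

With two shmem-attached workers and one libpq-only worker, a crash of
either shmem-attached worker restarts both of them, while a clean exit
restarts only the worker that stopped.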

> b. Will these worker tasks be able to take on new jobs, or will they
> only do the jobs they were started with?

Not sure I understand this question. If a worker connects to a
database, it will stay connected to that database until it dies;
changing DBs is not allowed. If you want a worker that connects to
database A, does stuff there, and then connects to database B, it could
connect to A, do its deed, then set up database=B in shared memory and
stop, which will cause postmaster to restart it; next time it starts, it
reads shmem and knows to connect to the other DB.
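
A rough simulation of that A-then-B pattern, as a sketch only. A plain
Python dict stands in for shared memory and a loop stands in for
postmaster restarting the worker; worker_main(), supervisor() and the
"plan" key are hypothetical names, not anything in the patch.

```python
def worker_main(shmem, log):
    """One lifetime of the worker: connect once, work, pick the next DB, exit."""
    db = shmem["database"]   # read which database to connect to this time
    log.append(db)           # stand-in for "connect to db and do its deed"

    # Before stopping, record the next database so the next incarnation
    # of the worker knows where to connect.
    plan = shmem["plan"]
    shmem["database"] = plan[(plan.index(db) + 1) % len(plan)]
    # Returning models process exit; the worker never switches databases
    # within a single lifetime.


def supervisor(shmem, restarts):
    """Stand-in for postmaster: start the worker again each time it stops."""
    log = []
    for _ in range(restarts):
        worker_main(shmem, log)
    return log
```

Starting with {"database": "A", "plan": ["A", "B"]}, four restarts visit
A, B, A, B, with each lifetime bound to exactly one database.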

My code has the ability to connect to no particular database, which is
what the autovac launcher does (this lets it read shared catalogs). So
you could do useful things like have the first invocation of your worker
connect that way and read pg_database to determine which DB to connect
to next, then terminate.

You could also have worker groups commanded by one process: one queen
bee, one or more worker bees. The queen determines what to do, sets
tasklist info in shmem, signals worker bees. While the tasklist is
empty, workers would sleep.
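
The queen/worker arrangement could look roughly like this. Again a
sketch under assumptions: Python threads and a threading.Condition stand
in for separate processes, a shmem task list and signals, and Hive,
queen_post(), close() and run_hive() are made-up names. The close() step
exists only so the example terminates; the workers discussed in this
thread would keep running.

```python
import threading
from collections import deque


class Hive:
    def __init__(self):
        self.tasks = deque()               # stand-in for the tasklist in shmem
        self.cond = threading.Condition()  # stand-in for signalling the bees
        self.closed = False
        self.done = []

    def queen_post(self, items):
        """Queen: publish tasks, then wake every sleeping worker bee."""
        with self.cond:
            self.tasks.extend(items)
            self.cond.notify_all()

    def close(self):
        """Tell the bees to exit once the tasklist has been drained."""
        with self.cond:
            self.closed = True
            self.cond.notify_all()

    def worker_bee(self):
        while True:
            with self.cond:
                # While the tasklist is empty, sleep until signalled.
                while not self.tasks and not self.closed:
                    self.cond.wait()
                if not self.tasks:
                    return                 # closed and nothing left to do
                task = self.tasks.popleft()
            # Real work would happen here, outside the lock.
            with self.cond:
                self.done.append(task)


def run_hive(tasks, n_workers=2):
    hive = Hive()
    bees = [threading.Thread(target=hive.worker_bee) for _ in range(n_workers)]
    for bee in bees:
        bee.start()
    hive.queen_post(tasks)
    hive.close()
    for bee in bees:
        bee.join()
    return sorted(hive.done)
```

Which bee picks up which task is not deterministic, so run_hive() sorts
the completed list before returning it.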

As you can see there are many things that can be done with this.

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
