Re: Background worker assistance & review

From: Keith Fiske <keith(at)omniti(dot)com>
To: Craig Ringer <craig(at)2ndquadrant(dot)com>
Cc: PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Background worker assistance & review
Date: 2015-04-10 17:00:18
Message-ID: CAG1_KcDcWdA+AFsZO5po=PBhVPE-_adf6TkpHq6W46KwC=+17w@mail.gmail.com
Lists: pgsql-general

On Thu, Apr 9, 2015 at 11:56 PM, Craig Ringer <craig(at)2ndquadrant(dot)com> wrote:

>
>
> On 9 April 2015 at 05:35, Keith Fiske <keith(at)omniti(dot)com> wrote:
>
>> I'm working on a background worker (BGW) for my pg_partman extension.
>> I've gotten the basics of it working for my first round, but there are two
>> features I'm missing that I'd like to add before release:
>>
>> 1) Only allow one instance of this BGW to run
>>
>
> Load your extension in shared_preload_libraries, so that _PG_init runs in
> the postmaster. Register a static background worker then.
>
> If you need one worker per database (because it needs to access the DB)
> this won't work for you, though. What we do in BDR is have a single static
> background worker that's launched by the postmaster, which then launches
> and terminates per-database workers that do the "real work".
>
> Because of a limitation in the bgworker API in releases 9.4 and older, the
> static worker has to connect to a database if it wants to access shared
> catalogs like pg_database. This limitation has been lifted in 9.5 though,
> along with the need to use the database name instead of its oid to connect
> (which left bgworkers unable to handle RENAME DATABASE).
>
> (We still really need a hook on CREATE DATABASE too)
>
>> 2) Create a bgw_terminate_partman() function to stop it more intuitively
>> than doing a pg_cancel_backend() on the PID
>>
>
> If you want it to be able to be started/stopped dynamically, you should
> probably use RequestAddinShmemSpace to allocate a small shared memory
> block. Use that to register the PGPROC for the current worker when the
> worker starts, and add a boolean field you can use to ask it to terminate
> itself. You'll also need an LWLock to protect access to the segment, so
> you don't have races between a worker starting and the user asking to
> cancel it, etc.
>
> Unfortunately the BackgroundWorkerHandle struct is opaque, so you cannot
> store it in shared memory when it's returned by
> RegisterDynamicBackgroundWorker() and use it to later check the worker's
> status or ask it to exit. You have to use regular backend manipulation
> functions and PGPROC instead.
>
> Personally, I suggest that you leave the worker as a static worker, and
> leave it always running when the extension is active. If it isn't doing
> anything, have it sleep on its latch, then set its latch from other
> processes when something interesting happens. (You can put the process
> latch from PGPROC into your shmem segment so you can set it from elsewhere,
> or allocate a new latch).
>
>> This is my first venture into writing C code for postgres, so I'm not
>> familiar with a lot of the internals yet. I read
>> http://www.postgresql.org/docs/9.4/static/bgworker.html and I see it
>> mentioning how you can check the status of a BGW launched dynamically and
>> the function to terminate one, but I'm not clear how you can get the
>> information on a currently running BGW to do these things.
>>
>
> You can't. It's a pretty significant limitation in the current API.
> There's no way to enumerate bgworkers via the bgworker API, only via PGPROC.
>
>
>> I used the worker_spi example for a lot of this, so if there's any
>> additional guidance for a better way to do what I've done, I'd appreciate
>> it. All I really have it doing now is calling the run_maintenance()
>> function at a defined interval and don't need it doing more than that yet.
>>
>
> The BDR project has an extension with much more in-depth use of background
> workers, but it's probably *too* complicated. We have a static bgworker
> that launches and terminates dynamic bgworkers (per-database) that in turn
> launch and terminate more dynamic background workers (per-connection to
> peer databases).
>
> If you're interested, all the code is mirrored on github:
>
> https://github.com/2ndquadrant/bdr/tree/bdr-plugin/next
>
> and the relevant parts are:
>
> https://github.com/2ndQuadrant/bdr/blob/bdr-plugin/next/bdr.c#L640
> https://github.com/2ndQuadrant/bdr/blob/bdr-plugin/next/bdr_perdb.c
> https://github.com/2ndQuadrant/bdr/blob/bdr-plugin/next/bdr_supervisor.c
> https://github.com/2ndQuadrant/bdr/blob/bdr-plugin/next/bdr_shmem.c
> https://github.com/2ndQuadrant/bdr/blob/bdr-plugin/next/bdr_apply.c#L2401
> https://github.com/2ndQuadrant/bdr/blob/bdr-plugin/next/bdr.h
>
> ... but there's a *lot* of code there.
>
> --
> Craig Ringer http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
>

Craig,

Thanks for the response! Definitely cleared up a lot of questions I had
regarding how to interact with currently running BGWs. Glad to know I can
at least stop banging my head against the desk about that. I've still got a
lot to learn as far as how to interact with shared memory, but now that I
know that's the path I have to go down, I'm fine with that.

My current plan now, after your response, is this:

- Statically launch master BGW with shared_preload_libraries
- Use dynamically launched BGW for each database that pg_partman will run
  on in the cluster. My previous idea of restricting it to one BGW would
  likely have stopped it from ever working on more than one database in a
  cluster.
- Will see if I can create a function that polls the cluster for currently
  existing databases that actually have pg_partman installed. This should
  eliminate the need for a GUC naming the databases to run for. Should allow
  handling if a database is renamed as well. This way, as soon as the
  extension is created on a database, it should hopefully "just work" and
  start managing it.
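
The bookkeeping behind the last two points can be sketched roughly as
follows. This is only a stand-alone model in Python, not PostgreSQL code:
the function name reconcile(), the OIDs and the database names are all made
up for illustration, and how the "installed" mapping is gathered from the
cluster is left out entirely. It assumes the master worker tracks databases
by OID, so a rename shows up as the same database with a new name.

```python
from typing import Dict, Set, Tuple


def reconcile(installed: Dict[int, str],
              running: Set[int]) -> Tuple[Dict[int, str], Set[int]]:
    """Decide which per-database workers to launch and which to stop.

    installed: hypothetical mapping of database OID -> current name, for
               databases where the extension was found on the last poll.
    running:   OIDs of databases that already have a worker.
    """
    # Launch a worker for every database that has the extension but no worker.
    # The current name is kept because the launch step may need it.
    to_start = {oid: name for oid, name in installed.items()
                if oid not in running}
    # Stop workers whose database was dropped or no longer has the extension.
    to_stop = running - set(installed)
    return to_start, to_stop


# One poll: "sales" needs a worker, the worker for OID 16390 is stale.
to_start, to_stop = reconcile({16384: "sales", 16385: "hr"}, {16385, 16390})
```

Because the comparison is by OID, a database that was only renamed lands in
neither set; the new name is simply what the next poll reports.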

9.4 is my targeted release to support for a while, so I'll just have to
deal with the shortcomings you mentioned. Does the above sound like it
could work then?

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
http://www.keithf4.com
