Re: pg_background (and more parallelism infrastructure patches)

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Andres Freund <andres(at)2ndquadrant(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_background (and more parallelism infrastructure patches)
Date: 2014-07-29 16:51:18
Message-ID: CA+TgmoZP2JtKdVfu0RXqoL+vo-wsjv2OxwTU83QmP_GqCyukVA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Jul 28, 2014 at 1:50 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> Don't get me wrong, I don't object to anything in here. It's just that
> the bigger picture can help giving sensible feedback.

Right. I did not get you wrong. :-)

The reason I'm making a point of it is that, if somebody wants to
object to the way those facilities are designed, it'd be good to get
that out of the way now rather than waiting until 2 or 3 patch sets
from now and then saying "Uh, could you guys go back and rework all
that stuff?". I'm not going to complain too loudly now if somebody
wants something in there done in a different way, but it's easier to
do that now while there's only pg_background sitting on top of it.

> What I'm thinking of is providing an actual API for the writes instead
> of hooking into the socket API in a couple places. I.e. have something
> like
>
> typedef struct DestIO DestIO;
>
> struct DestIO
> {
>     void (*flush)(struct DestIO *io);
>     int (*putbytes)(struct DestIO *io, const char *s, size_t len);
>     int (*getbytes)(struct DestIO *io, char *s, size_t len);
>     ...
> };
>
> and do everything through it. I haven't thought much about the specific
> API we want, but abstracting the communication properly instead of
> adding hooks here and there is imo much more likely to succeed in the
> long term.

This sounds suspiciously like the DestReceiver thing we've already
got, except that the DestReceiver only applies to tuple results, not
errors and notices and so on. I'm not totally unamenable to a bigger
refactoring here, but right now it looks to me like a solution in
search of a problem. The hooks are simple and seem to work fine; I
don't want to add notation for its own sake.
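
For concreteness, a rough sketch of what such an interface might look
like with a single in-memory implementation behind it. Only the three
callbacks come from the quoted proposal; MemDestIO, the mem_* functions
and send_message() are hypothetical names made up for illustration, and
none of this is PostgreSQL code. The sketch also assumes getbytes fills
a writable buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch of the proposed DestIO idea; not PostgreSQL code. */
typedef struct DestIO DestIO;

struct DestIO
{
    void (*flush)(DestIO *io);
    int  (*putbytes)(DestIO *io, const char *s, size_t len);
    int  (*getbytes)(DestIO *io, char *s, size_t len);
};

/* In-memory implementation, standing in for a socket or a message queue. */
typedef struct MemDestIO
{
    DestIO  base;       /* must be first so a DestIO * can be cast back */
    char    buf[256];
    size_t  used;       /* bytes written so far */
    size_t  readpos;    /* next byte to hand out */
    int     flushes;    /* number of flush calls, for illustration */
} MemDestIO;

static void
mem_flush(DestIO *io)
{
    ((MemDestIO *) io)->flushes++;
}

/* Append len bytes; returns 0 on success, -1 if the buffer is full. */
static int
mem_putbytes(DestIO *io, const char *s, size_t len)
{
    MemDestIO *m = (MemDestIO *) io;

    if (len > sizeof(m->buf) - m->used)
        return -1;
    memcpy(m->buf + m->used, s, len);
    m->used += len;
    return 0;
}

/* Read exactly len bytes; returns 0 on success, -1 if not enough data. */
static int
mem_getbytes(DestIO *io, char *s, size_t len)
{
    MemDestIO *m = (MemDestIO *) io;

    if (len > m->used - m->readpos)
        return -1;
    memcpy(s, m->buf + m->readpos, len);
    m->readpos += len;
    return 0;
}

static void
mem_destio_init(MemDestIO *m)
{
    assert(m != NULL);
    memset(m, 0, sizeof(*m));
    m->base.flush = mem_flush;
    m->base.putbytes = mem_putbytes;
    m->base.getbytes = mem_getbytes;
}

/* Caller-side code sees only the DestIO interface, never the transport. */
static int
send_message(DestIO *io, const char *s, size_t len)
{
    if (io->putbytes(io, s, len) != 0)
        return -1;
    io->flush(io);
    return 0;
}
```

The point of the indirection is that send_message() would not change if
the bytes went to a socket or a shared-memory queue instead; whether
that is worth the extra layer compared with a couple of hooks is the
question under discussion.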

>> > Also, you seem to have only touched receiving from the client, and not
>> > sending back to the subprocess. Is that actually sufficient? I'd expect
>> > that for this facility to be fully useful it'd have to be two way
>> > communication. But perhaps I'm overestimating what it could be used for.
>>
>> Well, the basic shm_mq infrastructure can be used to send any kind of
>> messages you want between any pair of processes that care to establish
>> them. But in general I expect that data is going to flow mostly in
>> one direction - the user backend will launch workers and give them an
>> initial set of instructions, and then results will stream back from
>> the workers to the user backend. Other messaging topologies are
>> certainly possible, and probably useful for something, but I don't
>> really know exactly what those things will be yet, and I'm not sure
>> the FEBE protocol will be the right tool for the job anyway.
>
> It's imo not particularly unreasonable to e.g. COPY to/from a bgworker. Which
> would require the ability to both read/write from the other side.

Well, that should work fine if the background worker and user backend
generate the CopyData messages via some bespoke code rather than
expecting to be able to jump into copy.c and have everything work. If
you want copy.c itself to work there, why? It doesn't make much sense for
pg_background, because I don't think it would be sensible for SELECT
pg_background_result(...) to return CopyInResponse or CopyOutResponse,
and even if it were sensible it doesn't seem useful. And I can't
think of any other application off-hand, either.
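
As an illustration of what "bespoke code" for CopyData could amount to,
here is a minimal sketch that frames a payload as a CopyData message in
the frontend/backend protocol layout: a 'd' type byte, a 4-byte
big-endian length that counts itself but not the type byte, then the
data. The function name frame_copy_data() and its buffer-based
signature are assumptions for the example; real code in a worker would
push the bytes through whatever transport it uses (for instance a
shm_mq) rather than into a caller-supplied array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: write a CopyData message for payload into out.
 *
 * Layout: Byte1('d'), Int32 length (includes itself, excludes the type
 * byte), then the payload bytes.
 *
 * Returns the total number of bytes written, or 0 if the message does
 * not fit in outcap bytes or the payload is too large for the length
 * field.
 */
static size_t
frame_copy_data(const char *payload, size_t len,
                unsigned char *out, size_t outcap)
{
    uint32_t msglen;

    assert(out != NULL);
    assert(payload != NULL || len == 0);

    /* 1 type byte + 4 length bytes of overhead */
    if (outcap < 5 || len > outcap - 5 || len > UINT32_MAX - 4)
        return 0;

    msglen = (uint32_t) len + 4;

    out[0] = 'd';
    out[1] = (unsigned char) ((msglen >> 24) & 0xFF);   /* network byte order */
    out[2] = (unsigned char) ((msglen >> 16) & 0xFF);
    out[3] = (unsigned char) ((msglen >> 8) & 0xFF);
    out[4] = (unsigned char) (msglen & 0xFF);

    if (len > 0)
        memcpy(out + 5, payload, len);

    return len + 5;
}
```

A sender would call this once per chunk of COPY data and hand the
resulting bytes to the other side, which strips the 5-byte header and
processes the payload itself instead of going through copy.c.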

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
