Re: Asynchronous Append on postgres_fdw nodes.

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, "movead(dot)li" <movead(dot)li(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Asynchronous Append on postgres_fdw nodes.
Date: 2020-12-19 09:20:52
Message-ID: CAPmGK14UrEDSspoXjaB46oV2QGJk+6kOD91YQ6QmShyiCsg1jg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Dec 14, 2020 at 5:56 PM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
> At Sat, 12 Dec 2020 19:06:51 +0900, Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com> wrote in
> > On Fri, Nov 20, 2020 at 8:16 PM Kyotaro Horiguchi
> > <horikyota(dot)ntt(at)gmail(dot)com> wrote:

> > > + /* wait or poll async events */
> > > + if (!bms_is_empty(node->as_asyncpending))
> > > + {
> > > + Assert(!node->as_syncdone);
> > > + Assert(bms_is_empty(node->as_needrequest));
> > > + ExecAppendAsyncEventWait(node);
> > >
> > > You moved the function to wait for events from execAsync to
> > > nodeAppend. The former is a generic module that can be used from any
> > > kind of executor nodes, but the latter is specialized for nodeAppend.
> > > In other words, the abstraction level is lowered here. What is the
> > > reason for the change?
> >
> > The reason is just because that function is only called from
> > ExecAppend(). I put some functions only called from nodeAppend.c in
> > execAsync.c, though.
>
> (I think) You told me that you preferred the genericity of the
> original interface, but you're doing the opposite. If you think we
> can move such a generic feature to a part of Append node, all other
> features can be move the same way. I guess there's a reason you want
> only the this specific feature out of all of them be Append-spcific
> and I want to know that.

The reason is that I’m thinking to add a small feature for
multiplexing Append subplans, not a general feature for async
execution as discussed in [1], because this would be an interim
solution until the executor rewrite is done.

> > I think that when an async-aware node gets a tuple from an
> > async-capable node, they should use ExecAsyncRequest() /
> > ExecAyncHogeResponse() rather than ExecProcNode() [1]. I modified the
> > patch so that ExecAppendAsyncResponse() is called from Append, but to
> > support bubbling up the plan tree discussed in [2], I think it should
> > be called from ForeignScans (the sides of async-capable nodes). Am I
> > right? Anyway, I’ll rename ExecAppendAyncResponse() to the one
> > proposed in Robert’s original patch.
>
> Even though I understand the concept but to make work it we need to
> remember the parent *async* node somewhere. In my faint memory the
> very early patch did something like that.
>
> So I think just providing ExecAsyncResponse() doesn't make it
> true. But if we make it true, it would be something like
> partially-reversed steps from what the current Exec*()s do for some of
> the existing nodes and further code is required for some other nodes
> like WindowFunction. Bubbling up works only in very simple cases where
> a returned tuple is thrown up to further parent as-is or at least when
> the node convers a tuple into another shape. If an async-receiver node
> wants to process multiple tuples from a child or from multiple
> children, it is no longer be just a bubbling up.

I explained the meaning of “bubbling up the plan tree” in a previous
email I sent a moment ago.

> And.. I think the reason I feel uneasy for the patch may be that the
> patch uses the interface names in somewhat different context.
> Origianlly the fraemework resides in-between executor nodes, not on a
> node of either side. ExecAsyncNotify() notifies the requestee about an
> event and ExecAsyncResonse() notifies the requestor about a new
> tuple. I don't feel strangeness in this usage. But this patch feels to
> me using the same names in different (and somewhat wrong) context.

Sorry, this is a WIP patch. Will fix.

> > > + /* Perform the actual callback. */
> > > + ExecAsyncNotify(areq);
> > >
> > > Mmm. The usage of the function (or its name) looks completely reverse
> > > to me.

> > As mentioned in a previous email, this is an FDW callback routine
> > revived from Robert’s patch. I think the naming is reasonable,
> > because the callback routine notifies the FDW of readiness of a file
> > descriptor. And actually, the callback routine tells the core whether
> > the corresponding ForeignScan node is ready for a new request or not,
> > by setting the callback_pending flag accordingly.
>
> Hmm. Agreed. The word "callback" is also used there [3]... I
> remember and it seems reasonable that the core calls AsyncNotify() on
> FDW and the FDW calls ExecForeignScan as a response to it and notify
> back to core of that using ExecAsyncRequestDone(). But the patch here
> feels a little strange, or uneasy, to me.

I’m not sure what I should do to improve the patch. Could you
elaborate a bit more on this part?

> > > postgres_fdw.c
> > >
> > > > postgresIterateForeignScan(ForeignScanState *node)
> > > > {
> > > > PgFdwScanState *fsstate = (PgFdwScanState *) node->fdw_state;
> > > > TupleTableSlot *slot = node->ss.ss_ScanTupleSlot;
> > > >
> > > > /*
> > > > * If this is the first call after Begin or ReScan, we need to create the
> > > > * cursor on the remote side.
> > > > */
> > > > if (!fsstate->cursor_exists)
> > > > create_cursor(node);

> > > That being said, I think we should unify the
> > > code except the differences between async and sync. For example, if
> > > the fetch_more_data_begin() needs to be called only for async
> > > fetching, the cursor should be created before calling the function, in
> > > the code path common with sync fetching.
> >
> > I think that that would make the code easier to understand, but I’m
> > not 100% sure we really need to do so.
>
> And I believe that we don't tolerate even the slightest performance
> degradation.

In the case of async execution, the cursor would have already been
created before we get here as mentioned by you, so we would just skip
create_cursor() in that case. I don’t think that that would degrade
performance noticeably. Am I wrong?

Thanks again!

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/CA%2BTgmobx8su_bYtAa3DgrqB%2BR7xZG6kHRj0ccMUUshKAQVftww%40mail.gmail.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Alastair Turner 2020-12-19 11:45:08 Re: Proposed patch for key managment
Previous Message Etsuro Fujita 2020-12-19 08:55:22 Re: Asynchronous Append on postgres_fdw nodes.