Re: POC: postgres_fdw insert batching

From: Craig Ringer <craig(dot)ringer(at)enterprisedb(dot)com>
To: "tsunakawa(dot)takay(at)fujitsu(dot)com" <tsunakawa(dot)takay(at)fujitsu(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "Andrey V(dot) Lepikhov" <a(dot)lepikhov(at)postgrespro(dot)ru>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: POC: postgres_fdw insert batching
Date: 2020-11-25 02:10:44
Message-ID: CAGRY4nzSZTn-EbvNJq+ZRVKrMv0pHGqb6xfi-SAiM5+LrX0Ebw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 8, 2020 at 10:40 AM tsunakawa(dot)takay(at)fujitsu(dot)com <
tsunakawa(dot)takay(at)fujitsu(dot)com> wrote:

>
> Thank you for picking up this. I'm interested in this topic, too. (As an
> aside, we'd like to submit a bulk insert patch for ECPG in the near future.)
>
> As others referred, Andrey-san's fast COPY to foreign partitions is also
> promising. But I think your bulk INSERT is a separate feature and offers
> COPY cannot do -- data transformation during loading with INSERT SELECT and
> CREATE TABLE AS SELECT.
>
> Is there anything that makes you worry and stops development? Could I
> give it a try to implement this (I'm not sure I can, sorry. I'm worried if
> we can change the executor's call chain easily.)
>
>
I suggest that when developing this, you keep in mind the ongoing work on
the libpq pipelining/batching enhancements, and also the way many
interfaces to foreign data sources support asynchronous, concurrent
operations.

Best results with postgres_fdw insert batching would be achieved if it can
also send its batches as asynchronous queries and only block when it's
required to report on the results of the work. This will also be true of
any other FDW where the backing remote interface can support asynchronous
concurrent or pipelined operation.

I'd argue it's pretty much vital for decent performance when talking to a
cloud database from an on-prem server for example, or any other time that
round-trip-time reduction is important.

The most important characteristic of an FDW API to permit this would be
decoupling of request and response into separate non-blocking calls that
don't have to occur in ordered pairs. Instead of "insert_foo(foo) ->
insert_result", have "queue_insert_foo(foo) -> future_result",
"get_result_if_available(future_result) -> maybe result" and
"get_result_blocking(future_result) -> result". Permit multiple
queue_insert_foo(...)s without a/b interleaving with result fetches being
required.

Ideally it'd be able to accumulate small batches of inserts locally and
send a batch to the remote end once it's accumulated enough. But instead of
blocking waiting for the result, return control to the executor after
sending, without forcing a socket flush (which might block) and without
waiting to learn what the outcome was. Allow new batches to be accumulated
and sent before the results of the first batch are received, so long as
it's within the same executor node so we don't make any unfortunate
mistakes with mixing things up in compound statements or functions etc.
Only report outcomes like rowcounts lazily when results are received, or
when required to do so.

If now we have

REQUEST -> [block] -> RESULT
~~ round trip delay ~~
REQUEST -> [block] -> RESULT
~~ round trip delay ~~
REQUEST -> [block] -> RESULT
~~ round trip delay ~~
REQUEST -> [block] -> RESULT

and batching would give us

{ REQUEST, REQUEST} -> [block] -> { RESULT, RESULT }
~~ round trip delay ~~
{ REQUEST, REQUEST} -> [block] -> { RESULT, RESULT }

consider if room can be left in the batching API to permit:

{ REQUEST, REQUEST} -> [nonblocking send...]
{ REQUEST, REQUEST} -> [nonblocking send...]
~~ round trip delay ~~
[....] -> RESULT, RESULT
[....] -> RESULT, RESULT

... where we only actually block at the point where the result is required
as input into the next node.

Honestly I don't know the executor structure well enough to say if this is
even remotely feasible right now. Maybe Andres may be able to comment. But
please keep it in mind if you're thinking of making FDW API changes.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Li Japin 2020-11-25 02:18:28 Re: Terminate the idle sessions
Previous Message Tom Lane 2020-11-25 01:59:14 Re: Keep elog(ERROR) and ereport(ERROR) calls in the cold path