Re: POC: postgres_fdw insert batching

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(at)2ndquadrant(dot)com>, Amit Langote <amitlangote09(at)gmail(dot)com>
Subject: Re: POC: postgres_fdw insert batching
Date: 2020-06-29 10:52:15
Message-ID: CAExHW5s2GEi9tp522Z1sn5D=eLxU-+7K+3h9CcyNCVh=hF7ShQ@mail.gmail.com
Lists: pgsql-hackers

On Sun, Jun 28, 2020 at 8:40 PM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:

>
> FDW batching: 4584 ms
>
> So, rather nice improvement, I'd say ...

Very nice.

>
> Before I spend more time hacking on this, I have a couple open questions
> about the design, restrictions etc.
>
>
> 1) Extend the FDW API?
>
> In the patch, the batching is simply "injected" into the existing insert
> API method, i.e. ExecForeignInsert et al. I wonder if it'd be better to
> extend the API with a "batched" version of the method, so that we can
> easily determine whether the FDW supports batching or not - it would
> require changes in the callers, though. OTOH it might be useful for
> COPY, where we could do something similar to multi_insert (COPY already
> benefits from this patch, but it does not use the batching built-into
> COPY).

Amit Langote has pointed out a related patch being discussed on hackers at [1].

That patch introduces a new API. But if we can do it without
introducing a new API, that would be good. FDWs which can support
batching can then just modify their code, and don't have to implement
and manage a new API. We already have a handful of such APIs.

>
> 2) What about the insert results?
>
> I'm not sure what to do about "result" status for the inserted rows. We
> only really "stash" the rows into a buffer, so we don't know if it will
> succeed or not. The patch simply assumes it will succeed, but that's
> clearly wrong, and it may result in reporting a wrong number of rows.

I didn't get this. We are executing an INSERT on the foreign server,
so we get the number of rows INSERTed from that server. We should just
add those up across batches. If there's a failure, it would abort the
transaction, local as well as remote.
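To make the point concrete, here is a minimal sketch (hypothetical helper names, not the actual postgres_fdw code) of how the reported row count would simply be the sum of the per-batch counts returned by the remote server:

```python
# Rows are stashed locally and flushed in batches; the total reported to
# the caller is the sum of per-batch counts. In a real FDW, flush_batch
# would run one remote "INSERT ... VALUES (...), (...)" and return the
# count from PQcmdTuples().

BATCH_SIZE = 100

def flush_batch(batch):
    # Stand-in for executing one batched INSERT on the remote server.
    return len(batch)

def insert_rows(rows, batch_size=BATCH_SIZE):
    total, batch = 0, []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            total += flush_batch(batch)
            batch = []
    if batch:  # flush the final, possibly partial, batch
        total += flush_batch(batch)
    return total

print(insert_rows(range(250)))  # 250: batches of 100 + 100 + 50
```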

>
> The patch also disables the batching when the insert has a RETURNING
> clause, because there's just a single slot (for the currently inserted
> row). I suppose a "batching" method would take an array of slots.
>

It will be a rare case when a bulk load also has a RETURNING clause,
so we can live with this restriction. We should try to choose a
design which allows that restriction to be lifted in the future, but I
doubt the restriction will be a serious one.

>
> 3) What about the other DML operations (DELETE/UPDATE)?
>
> The other DML operations could probably benefit from the batching too.
> INSERT was good enough for a PoC, but having batching only for INSERT
> seems somewhat asymmetric. DELETE/UPDATE seem more complicated because
> of quals, but likely doable.

Bulk INSERTs are more common in a sharded environment because of data
loads in, say, OLAP systems. Bulk UPDATE/DELETE are rarer, although
not unheard of. So an approach that supports only bulk INSERT and not
bulk UPDATE/DELETE would still address a large number of use cases,
IMO. But if we can make everything work together, that would be good
as well.

In your patch, I see that a batched INSERT statement is constructed as
INSERT INTO ... VALUES (...), (...) with as many VALUES rows as the
batch size. That won't work as is for UPDATE/DELETE, since we can't
pass multiple pairs of ctids and columns to be updated, one pair per
ctid, in a single statement. Maybe we could build as many
UPDATE/DELETE statements as the size of a batch, but that would be
ugly. What we need is a feature like batched prepared statements in
libpq, similar to what JDBC supports
(https://mkyong.com/jdbc/jdbc-preparedstatement-example-batch-update/).
That would allow a single prepared statement to be executed with a
batch of parameters, each set of parameters corresponding to one
foreign DML statement.
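A small sketch of the asymmetry (table and column names are made up for illustration): the INSERT collapses into one statement with a longer VALUES list, while each UPDATE needs its own ctid qual, leaving one statement per row unless libpq grows JDBC-style batch execution.

```python
# Build the SQL text that each approach would send to the remote server.

def batched_insert_sql(table, ncols, batch):
    # One statement, one parameter placeholder group per row in the batch.
    row = "(" + ", ".join(["%s"] * ncols) + ")"
    return f"INSERT INTO {table} VALUES " + ", ".join([row] * len(batch))

def per_row_update_sql(table, set_cols, batch):
    # Each row is addressed by its own ctid, so without a libpq batch
    # facility we end up with one UPDATE statement per row.
    assignments = ", ".join(f"{c} = %s" for c in set_cols)
    return [f"UPDATE {table} SET {assignments} WHERE ctid = %s" for _ in batch]

print(batched_insert_sql("ft", 2, [("a", 1), ("b", 2)]))
# INSERT INTO ft VALUES (%s, %s), (%s, %s)
print(len(per_row_update_sql("ft", ["val"], [("x", "(0,1)"), ("y", "(0,2)")])))
# 2
```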

>
>
> 3) Should we do batching for COPY instead?
>
> While looking at multi_insert, I've realized it's mostly exactly what
> the new "batching insert" API function would need to be. But it's only
> really used in COPY, so I wonder if we should just abandon the idea of
> batching INSERTs and do batching COPY for FDW tables.

I think this won't support RETURNING either. But if we could somehow
use the COPY protocol to send the data to the foreign server and yet
treat it as an INSERT, that might work. We'd have to find out which
performs better, COPY or batched INSERT.

>
> For cases that can replace INSERT with COPY this would be enough, but
> unfortunately it does nothing for DELETE/UPDATE so I'm hesitant to do
> this :-(

Agreed, if we want to support bulk UPDATE/DELETE as well.

>
>
> 4) Expected consistency?
>
> I'm not entirely sure what are the consistency expectations for FDWs.
> Currently the FDW nodes pointing to the same server share a connection,
> so the inserted rows might be visible to other nodes. But if we only
> stash the rows in a local buffer for a while, that's no longer true. So
> maybe this breaks the consistency expectations?
>
> But maybe that's OK - I'm not sure how the prepared statements/cursors
> affect this. I can imagine restricting the batching only to plans where
> this is not an issue (single FDW node or something), but it seems rather
> fragile and undesirable.

I think that area is grey. Depending upon where a cursor is
positioned when a DML node executes a query, the data fetched from the
cursor may or may not see the effect of the DML. The cursor position
depends on the batch size, so we already have problems in this area, I
think. Assuming that the DML and SELECT are independent, this will
work. So the consistency problem already exists; it will just be
modulated by batching the DML. I doubt it's related to this feature
exclusively, and it should be solved independently of this feature.

>
> I was thinking about adding a GUC to enable/disable the batching at some
> level (global, server, table, ...) but it seems like a bad match because
> the consistency expectations likely depend on a query. There should be a
> GUC to set the batch size, though (it's hardcoded to 100 for now).
>

Similar to fetch_size, it should be a foreign-server-level and table-level setting, IMO.

[1] https://www.postgresql.org/message-id/flat/3d0909dc-3691-a576-208a-90986e55489f%40postgrespro.ru

--
Best Wishes,
Ashutosh Bapat
