Re: FDW INSERT batching can change behavior

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: git(at)jasonk(dot)me, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: FDW INSERT batching can change behavior
Date: 2024-08-14 11:31:02
Message-ID: 1ee2d452-45c8-4512-bc4b-773b0ec5d0a0@vondra.me
Lists: pgsql-bugs

On 8/13/24 20:31, Tomas Vondra wrote:
> Hi,
>
> I took a closer look at this, planning to find a way to fix it, but I think
> it's actually a bit worse than reported - both in impact and in how to
> fix it.
>
> The problem is it's not really specific to DEFAULT values. The exact
> same issue exists whenever the insert uses such expressions directly.
> That is, if you do this:
>
>
>   insert into grem1 (a) values (counter()), (counter()),
>                                (counter()), (counter()),
>                                (counter());
>
> it will misbehave in exactly the same way as with the default values. Of
> course, this also means that my original idea to disable batching if the
> foreign table has a (volatile) expression in the DEFAULT value won't fly.
>
> This can happen whenever the to-be-inserted rows have any expressions.
> But those expressions are evaluated *outside* ModifyTable - in the nodes
> that produce the rows. In the above example it's ValuesScan. But it could
> be any other node. For example:
>
> insert into grem1 (a) select counter() from generate_series(1,5);
>
> does that in a subquery. But AFAICS it could be any other node.
>
> Ideally we'd simply set batch_size=1 for those cases, but at this point
> I have no idea how to check this from ModifyTable :-(
>
> In retrospect the issue is pretty obvious, yet I didn't think about
> this while working on the batching. This is embarrassing.
>

I've been thinking about this a bit more, and I'm not really sure using
counter() as a default value can ever be "correct". The problem is it's
inherently broken with concurrency - even with no batching, it'll fail
if two of these inserts run at the same time. The batching only makes
that more obvious / easier to hit, but it's not really the root cause.
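
To illustrate the concurrency point, here is a rough two-session sketch. The
definition of counter() is not shown in this message, so the one below is
purely an assumption: a function that derives its result from the current
contents of grem1. The statements are meant to be run by hand in two separate
sessions, in the order shown, not as a single script.

```sql
-- Hypothetical definition (assumption): the real counter() is not shown
-- in this message. It is assumed to compute its result from the rows
-- currently visible in grem1.
CREATE FUNCTION counter() RETURNS int
    LANGUAGE sql VOLATILE
    AS $$ SELECT count(*)::int FROM grem1 $$;

-- Session A:
BEGIN;
INSERT INTO grem1 (a) VALUES (counter());
-- A has not committed yet.

-- Session B, started before A commits:
BEGIN;
INSERT INTO grem1 (a) VALUES (counter());
-- B cannot see A's uncommitted row, so its counter() call sees the same
-- table contents as A's call did and returns the same value.

-- Session A:
COMMIT;

-- Session B:
COMMIT;
```

Under that assumption both sessions insert the same value even with batching
disabled, which is the same kind of misbehavior the batching produces within
a single statement.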

regards

--
Tomas Vondra
