Re: FDW INSERT batching can change behavior

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: git(at)jasonk(dot)me, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: FDW INSERT batching can change behavior
Date: 2024-08-20 21:47:30
Message-ID: 148fcd22-d9d8-4363-896f-e5a5cc233023@vondra.me
Lists: pgsql-bugs

On 8/14/24 13:31, Tomas Vondra wrote:
> On 8/13/24 20:31, Tomas Vondra wrote:
>> Hi,
>>
>> I took a closer look at this, planning a way to fix it, but I think
>> it's actually a bit worse than reported - both in its impact and in how
>> to fix it.
>>
>> The problem is it's not really specific to DEFAULT values. The exact
>> same issue exists whenever the INSERT uses volatile expressions directly.
>> That is, if you do this:
>>
>>
>> insert into grem1 (a)
>>   values (counter()), (counter()), (counter()), (counter()), (counter());
>>
>> it will misbehave in exactly the same way as with the default values. Of
>> course, this also means that my original idea to disable batching if the
>> foreign table has a (volatile) expression in the DEFAULT value won't fly.
>>
>> This can happen whenever the to-be-inserted rows have any expressions.
>> But those expressions are evaluated *outside* ModifyTable - in the nodes
>> that produce the rows. In the above example it's a ValuesScan node, but it could
>> be any other node. For example:
>>
>> insert into grem1 (a) select counter() from generate_series(1,5);
>>
>> does that in a subquery. But AFAICS it could be any other node.
>>
>> Ideally we'd simply set batch_size=1 for those cases, but at this point
>> I have no idea how to check this from ModifyTable :-(
>>
>> In retrospect the issue is pretty obvious, yet I didn't think about
>> it while working on the batching. This is embarrassing.
>>
>
> I've been thinking about this a bit more, and I'm not really sure using
> counter() as a default value can ever be "correct". The problem is it's
> inherently broken with concurrency - even with no batching, it'll fail
> if two of these inserts run at the same time. The batching only makes
> that more obvious / easier to hit, but it's not really the root cause.
>
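
To recap, the counter() function in the quoted examples is, roughly, a
volatile function that computes the next value by reading the target
table. This is a sketch only - the exact definition in the original
report may differ - but it shows why the result depends on previously
inserted rows becoming visible one at a time:

  -- sketch of a counter()-style function, not the exact definition
  -- from the original report; it derives the next value by reading
  -- the foreign table itself
  create or replace function counter() returns int
      language sql volatile
  as $$
      select coalesce(max(a), 0) + 1 from grem1;
  $$;

Without batching, each row reaches the remote server before the next
counter() call, so the call sees the previously inserted rows. With
batching, the earlier rows of a batch haven't been flushed yet when the
later values are computed, so consecutive calls see the same max(a).
And as noted above, two concurrent inserts hit the same problem even
without batching.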

FWIW I've brought this up on pgsql-hackers, just to get more opinions on
whether this should be considered a bug, and I think the conclusion from
that thread is "not a bug".
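
For anyone hitting this in practice, postgres_fdw allows disabling the
batching explicitly, either for the whole server or for an individual
foreign table, which restores the row-by-row behavior:

  -- assuming grem1 is a postgres_fdw foreign table; use SET instead of
  -- ADD if batch_size was already set on it
  alter foreign table grem1 options (add batch_size '1');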

I do agree the behavior change is a bit surprising, but even ignoring
the concurrency issues, this INSERT/SELECT is inherently unsafe. There's
no guarantee the INSERT / SELECT will proceed in lockstep row by row,
which is what the counter() function relies on. We'd be well within our
rights to e.g. run the SELECT to completion first, stash the results into
a tuplestore (in a Materialize node or something like that), and only
then actually start processing the INSERT.

While that currently does not happen, it's not unthinkable we might do
something like that in a future release (e.g. by doing a bit of batching
or vectorization in the executor).
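
If someone needs a counter-like default that stays correct under both
batching and concurrency, the usual answer is a sequence - nextval()
hands out distinct values regardless of when earlier rows become
visible, so for example

  -- safe alternative: values come from a sequence, not from reading
  -- the table, so batching can't change the result
  create sequence grem1_a_seq;

  insert into grem1 (a)
  select nextval('grem1_a_seq') from generate_series(1,5);

behaves the same with or without batching.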

regards

--
Tomas Vondra
