Re: FDW INSERT batching can change behavior

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Jason Kim <git(at)jasonk(dot)me>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: FDW INSERT batching can change behavior
Date: 2024-08-13 12:12:27
Message-ID: ef565ecb-4807-491c-8eb7-577eab6522aa@vondra.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 8/13/24 06:57, Jason Kim wrote:
> On 2024-08-09T21:55:00+0200, Tomas Vondra wrote:
>> Yeah, we don't seem to check for this. I don't recall if it didn't occur
>> to me we could have DEFAULT on the foreign table.
>>
>> We could/should disable batching, but I'm not quite sure what exactly to
>> check. AFAIK this can happen only when there are default expressions on
>> the foreign table, so maybe that? Or maybe only when the DEFAULT calls a
>> volatile function?
>
> I didn't bother checking, but if CHECK constraints can call volatile functions,
> that is another avenue for differing behavior.
>

While we technically allow volatile functions in CHECK constraints, the
documentation [1] says:

PostgreSQL assumes that CHECK constraints' conditions are immutable,
that is, they will always give the same result for the same input row.

So the expectation is that CHECK only calls immutable stuff.

[1] https://www.postgresql.org/docs/current/ddl-constraints.html

> Here is a completely different example concerning the way batches to different
> partitions are flushed per-partition resulting in out-of-order insertion:
>
> CREATE EXTENSION postgres_fdw;
>
> CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
> DO $d$
> BEGIN
> EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
> OPTIONS (dbname '$$||current_database()||$$',
> port '$$||current_setting('port')||$$'
> )$$;
> END;
> $d$;
>
> CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
>
> create table itrtest (a int) partition by range (a);
> create table loct1 (a int check (a % 100 != 3));
> create foreign table remp1 (a int check (a % 100 != 3)) server loopback options (table_name 'loct1');
> create table loct2 (a int check (a % 100 != 3));
> create foreign table remp2 (a int check (a % 100 != 3)) server loopback options (table_name 'loct2');
> alter table itrtest attach partition remp1 for values from (1) to (100);
> alter table itrtest attach partition remp2 for values from (101) to (200);
>
> insert into itrtest values (1), (2), (101), (103), (3);
> truncate itrtest;
> alter server loopback options (add batch_size '3');
> insert into itrtest values (1), (2), (101), (103), (3);
>
> The first insert (non-batched) gives
>
> ERROR: new row for relation "loct2" violates check constraint "loct2_a_check"
> DETAIL: Failing row contains (103).
> CONTEXT: remote SQL command: INSERT INTO public.loct2(a) VALUES ($1)
>
> But the second insert (batched) gives
>
> ERROR: new row for relation "loct1" violates check constraint "loct1_a_check"
> DETAIL: Failing row contains (3).
> CONTEXT: remote SQL command: INSERT INTO public.loct1(a) VALUES ($1), ($2), ($3)
>
> This is because (103) is queued up in the batch and not actually inserted.
> There might be a more severe example than this, but I did not think too much
> about it.
>

I agree this is a bit annoying, because it makes the observed error a
bit unpredictable. But as long as it does not allow inserting rows
violating the CHECK constraint, I wouldn't call this a bug.

The only way to prevent this would be to ensure we inserts actually
happen in the input order, and that's inherently against the whole idea
of batching. So if you want that, you have to disable batching.

regards

--
Tomas Vondra

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Kuntal Ghosh 2024-08-13 12:12:29 Re: BUG #18559: Crash after detaching a partition concurrently from another session
Previous Message PG Bug reporting form 2024-08-13 09:22:41 BUG #18581: psql symbol append_history not found when quitting