Re: FDW INSERT batching can change behavior

From: Jason Kim <git(at)jasonk(dot)me>
To: Tomas Vondra <tomas(at)vondra(dot)me>, pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: FDW INSERT batching can change behavior
Date: 2024-08-13 04:57:39
Message-ID: 20240813045739.hbwnscxjqe5ifreu@jasonk.me
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

On 2024-08-09T21:55:00+0200, Tomas Vondra wrote:
> Yeah, we don't seem to check for this. I don't recall if it didn't occur
> to me we could have DEFAULT on the foreign table.
>
> We could/should disable batching, but I'm not quite sure what exactly to
> check. AFAIK this can happen only when there are default expressions on
> the foreign table, so maybe that? Or maybe only when the DEFAULT calls a
> volatile function?

I didn't bother checking, but if CHECK constraints can call volatile functions,
that is another avenue for differing behavior.

Here is a completely different example concerning the way batches to different
partitions are flushed per-partition resulting in out-of-order insertion:

CREATE EXTENSION postgres_fdw;

CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
DO $d$
BEGIN
EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (dbname '$$||current_database()||$$',
port '$$||current_setting('port')||$$'
)$$;
END;
$d$;

CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;

create table itrtest (a int) partition by range (a);
create table loct1 (a int check (a % 100 != 3));
create foreign table remp1 (a int check (a % 100 != 3)) server loopback options (table_name 'loct1');
create table loct2 (a int check (a % 100 != 3));
create foreign table remp2 (a int check (a % 100 != 3)) server loopback options (table_name 'loct2');
alter table itrtest attach partition remp1 for values from (1) to (100);
alter table itrtest attach partition remp2 for values from (101) to (200);

insert into itrtest values (1), (2), (101), (103), (3);
truncate itrtest;
alter server loopback options (add batch_size '3');
insert into itrtest values (1), (2), (101), (103), (3);

The first insert (non-batched) gives

ERROR: new row for relation "loct2" violates check constraint "loct2_a_check"
DETAIL: Failing row contains (103).
CONTEXT: remote SQL command: INSERT INTO public.loct2(a) VALUES ($1)

But the second insert (batched) gives

ERROR: new row for relation "loct1" violates check constraint "loct1_a_check"
DETAIL: Failing row contains (3).
CONTEXT: remote SQL command: INSERT INTO public.loct1(a) VALUES ($1), ($2), ($3)

This is because (103) is queued up in the batch and not actually inserted.
There might be a more severe example than this, but I did not think too much
about it.

Jason

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message PG Bug reporting form 2024-08-13 09:22:41 BUG #18581: psql symbol append_history not found when quitting
Previous Message Jason Kim 2024-08-13 04:17:51 a row is not inserted in nested INSERT ON CONFLICT