postgres_fdw: Provide better emulation of READ COMMITTED behavior

From: Etsuro Fujita <etsuro(dot)fujita(at)gmail(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: postgres_fdw: Provide better emulation of READ COMMITTED behavior
Date: 2024-12-05 09:41:04
Message-ID: CAPmGK14Nj7vuJELKuhyiwMMhat8SsyUY+88yAyTZLdqXG0xYuw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

postgres_fdw uses REPEATABLE READ isolation level for the remote
transaction when the local transaction has READ COMMITTED isolation
level, for the reason described in the comments for
begin_remote_xact() in connection.c:

/*
* Start remote transaction or subtransaction, if needed.
*
* Note that we always use at least REPEATABLE READ in the remote session.
* This is so that, if a query initiates multiple scans of the same or
* different foreign tables, we will get snapshot-consistent results from
* those scans. A disadvantage is that we can't provide sane emulation of
* READ COMMITTED behavior --- it would be nice if we had some other way to
* control which remote queries share a snapshot.
*/

But as mentioned above, this causes unexpected behavior like this:

S1: CREATE TABLE t1 (c1 TEXT);
S1: CREATE FOREIGN TABLE ft1 (c1 TEXT) SERVER loopback OPTIONS
(table_name 't1');
S1: INSERT INTO ft1 VALUES ('foo');
S2: START TRANSACTION ISOLATION LEVEL READ COMMITTED;
S2: SELECT * FROM ft1;
c1
-----
foo
(1 row)

Looks good, but:

S1: INSERT INTO ft1 VALUES ('bar');
S2: SELECT * FROM ft1;
c1
-----
foo
(1 row)

The SELECT query would be expected to retrieve not only the row
(‘foo’) but the row (‘bar’) inserted by session S1 just before, but
retrieves only the row (‘foo’).

I would like to propose a simple solution for this issue. The idea
for the solution is that while postgres_fdw uses REPEATABLE READ for
the remote transaction as before, it refreshes the snapshot for the
transaction so that the same snapshot is shared by remote queries from
within a top-level query (or a set of queries expanded from it by
rewrite rules) received via simple/extended query protocol. To do
this, changes I would like to make to the core and postgres_fdw code
is:

* Add IDs for iterations of the PostgresMain() loop for extensions
like postgres_fdw to know which top-level query is currently being
processed.
* Add a function pg_refresh_snapshot() to refresh the snapshot for a
REPEATABLE READ transaction. (This function would cause phantom read,
which is not allowed to occur at REPEATABLE READ isolation level in
Postgres, but is allowed by the SQL standard.)
* Modify postgres_fdw so that when encountering a new top-level query,
it refreshes the snapshot for the remote transaction by using
pg_refresh_snapshot() before sending the first query from within the
top-level query if it is safe to do so.

Attached is a WIP patch for that. With the patch we have:

S1: INSERT INTO ft1 VALUES ('foo');
S2: START TRANSACTION ISOLATION LEVEL READ COMMITTED;
S2: SELECT * FROM ft1;
c1
-----
foo
(1 row)

S1: INSERT INTO ft1 VALUES ('bar');
S2: SELECT * FROM ft1;
c1
-----
foo
bar
(2 rows)

The SELECT query retrieves both rows!

Comments welcome! Maybe I am missing something, though.

Best regards,
Etsuro Fujita

Attachment Content-Type Size
postgres_fdw-emulate-RC-behavior-WIP.patch application/octet-stream 23.0 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jakub Wartak 2024-12-05 09:43:41 doc: Mention clock synchronization recommendation for hot_standby_feedback
Previous Message vignesh C 2024-12-05 09:39:22 Re: SQL:2011 application time