Ghost data from failed FDW transactions?

From: Jacob Biesinger <jake(dot)biesinger(at)gmail(dot)com>
To: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Ghost data from failed FDW transactions?
Date: 2024-08-28 01:03:16
Message-ID: CAHYXj6eJiX0jQpq4u7xYc1qy3=zBN+sPhGMTJETAF0fdoSyNUA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi there!

We have a setup where, for compliance reasons, we hoist a portion of data
from several "tenant" databases into a "root" / common / untenanted DB.
Through the magic of postgres_fdw, row triggers, and distributed
transactions, we automatically hoist the needed columns into the untenanted
DB whenever there are changes in any of the tenant DBs. The root DB is
never written to outside of these triggers, and act as a sort of
cross-tenant index.

I'm scratching my head at a few rows in the root DB, where it seems the
corresponding tenant transaction rolled back, but the root DB transaction
committed -- there is no row in the tenant but the root DOES have one. I
don't have a smoking gun just yet, but this is not the first time we've
seen this issue.

Before I jump into particulars, does this sound like expected behavior? We
run SERIALIZABLE txn level everywhere (set at the cluster level).

Thanks so much in advance for any insights here!

Here's my setup:

## info + cluster-level flags
GCP cloudsql
postgres version: 15.7
default_transaction_isolation: serializable

## The tenant DB:

CREATE TABLE
"devices" (
"orgId" TEXT NOT NULL,
"patientId" TEXT NOT NULL,
"deviceId" TEXT NOT NULL,
"data" JSONB NOT NULL,
PRIMARY KEY ("orgId", "patientId", "deviceId")
);

## The root DB:

CREATE TABLE IF NOT EXISTS "devices" (
"orgId" TEXT NOT NULL,
"patientId" TEXT NOT NULL,
"deviceId" TEXT NOT NULL,
"data" JSONB NOT NULL,
"serialNumber" TEXT NOT NULL GENERATED ALWAYS AS ("data"->>'serialNumber')
STORED,
"status" TEXT NOT NULL GENERATED ALWAYS AS ("data"->>'status') STORED,
PRIMARY KEY ("orgId", "patientId", "deviceId"),

CONSTRAINT "deviceIdMatches" CHECK ("data"->>'id' = "deviceId"),
CONSTRAINT "patientIdMatches" CHECK ("data"->>'patientId' = "patientId"),

-- Prevent duplicate serial numbers that are simultaneously "active"
EXCLUDE ( "serialNumber" WITH = ) WHERE ("status" = 'active')
);

## FDW connection from tenant DB to the root:

CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER IF NOT EXISTS "fdw_server__root" FOREIGN DATA WRAPPER
postgres_fdw OPTIONS (
host '${instance-ip-address}',
dbname '${root-db}',
updatable 'true',
truncatable 'false',
keep_connections 'off'
);

CREATE USER MAPPING FOR "${remote-user}" SERVER "fdw_server__root" OPTIONS(
user '${remote-user}', password '$${PGPASSWORD}'
);

CREATE SCHEMA IF NOT EXISTS "rootDb";

GRANT USAGE ON SCHEMA "rootDb" TO "${user-name}";
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA "rootDb" TO
"${user-name}";
ALTER DEFAULT PRIVILEGES IN SCHEMA "rootDb" GRANT SELECT, INSERT, UPDATE,
DELETE ON TABLES TO "${user-name}";

IMPORT FOREIGN SCHEMA "public" LIMIT TO (devices)
FROM SERVER "fdw_server__root"
INTO "rootDb";

## Trigger setup on the tenant DB, hoisting rows when modified into the
root table:

-- Set up a trigger which hoists tenant devices into the rootDb
CREATE OR REPLACE FUNCTION hoist_devices() RETURNS TRIGGER
AS $hoist_devices$
BEGIN
IF (TG_OP IN ('UPDATE', 'DELETE')) THEN
DELETE FROM "rootDb"."devices"
WHERE "orgId" = OLD."orgId"
AND "patientId" = OLD."patientId"
AND "deviceId" = OLD."deviceId";
END IF;

IF (TG_OP IN ('INSERT', 'UPDATE')) THEN
INSERT INTO "rootDb"."devices" ("orgId", "patientId", "deviceId", "data")
SELECT
NEW."orgId",
NEW."patientId",
NEW."deviceId",
NEW."data";
END IF;

RETURN NEW;
END;
$hoist_devices$ LANGUAGE plpgsql;

CREATE OR REPLACE TRIGGER hoist_devices_insert_update_delete
AFTER INSERT OR UPDATE OR DELETE ON "devices"
FOR EACH ROW
EXECUTE FUNCTION hoist_devices();

## A particular endpoint attempts insertions like:

BEGIN;

INSERT INTO "devices" ("orgId", "patientId", "deviceId", "data")
VALUES (
'org1',
'patient1',
'device1',
'{"id": "device1", "patientId": "patient1", "serialNumber": "12345",
"status": "active" }'
);

COMMIT;

--
Jake Biesinger

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Richard Guo 2024-08-28 03:30:45 Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.
Previous Message David Rowley 2024-08-27 23:57:53 Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables.