From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com> |
Cc: | Aleksander Alekseev <aleksander(at)timescale(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amul Sul <sulamul(at)gmail(dot)com>, vignesh C <vignesh21(at)gmail(dot)com> |
Subject: | Re: Dump-restore loosing 'attnotnull' bit for DEFERRABLE PRIMARY KEY column(s). |
Date: | 2024-03-07 13:00:15 |
Message-ID: | 202403071300.74eklgyctmwc@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2024-Mar-05, Dean Rasheed wrote:
> So I think RelationGetIndexAttrBitmap() should include deferrable PKs,
I tried this, but it doesn't actually lead to a good place, because if
we allow deferrable PKs to identify rows, then they are not useful to
find the tuple to update when replicating. Consider the following case:
$node_publisher->safe_psql('postgres',
'create table deferred_pk (id int primary key initially deferred, hidden int, value text)');
$node_subscriber->safe_psql('postgres',
'create table deferred_pk (id int primary key initially deferred, hidden int, value text)');
$node_subscriber->safe_psql('postgres',
'alter subscription tap_sub refresh publication');
$node_publisher->safe_psql('postgres',
"insert into deferred_pk (id, hidden, value) values (1, 1, 'first')");
$node_publisher->wait_for_catchup('tap_sub');
$node_publisher->safe_psql('postgres',
qq{
begin;
insert into deferred_pk values (1, 2, 'conflicting');
update deferred_pk set value = value || ', updated' where id = 1 and hidden = 2;
update deferred_pk set id = 3, value = value || ', updated' where hidden = 2;
commit});
$node_publisher->wait_for_catchup('tap_sub');
my $pubdata = $node_publisher->safe_psql('postgres',
'select * from deferred_pk order by id');
my $subsdata = $node_subscriber->safe_psql('postgres',
'select * from deferred_pk order by id');
is($subsdata, $pubdata, "data is equal");
Here, the publisher's transaction first creates a new record with the
same PK, which only works because the PK is deferred; then we update its
payload column. When this is replicated, the row is identified by the
PK ... but replication actually updates the other row, because it's
found first:
# Failed test 'data is equal'
# at t/003_constraints.pl line 163.
# got: '1|2|conflicting
# 3|2|conflicting, updated, updated'
# expected: '1|1|first
# 3|2|conflicting, updated, updated'
Actually, is that what happened here? I'm not sure, but clearly this is
bogus.
So I think the original developers of REPLICA IDENTITY had the right
idea here (commit 07cacba983ef), and we mustn't change this aspect,
because it'll lead to data corruption in replication. Using a deferred
PK for DDL considerations seems OK, but it seems certain that for actual
data replication it's going to be a disaster.
--
Álvaro Herrera PostgreSQL Developer — https://www.EnterpriseDB.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Shlok Kyal | 2024-03-07 13:01:38 | Re: speed up a logical replica setup |
Previous Message | Cédric Villemain | 2024-03-07 12:26:00 | Re: Change prefetch and read strategies to use range in pg_prewarm ... and raise a question about posix_fadvise WILLNEED |