From: amul sul <sulamul(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key
Date: 2018-02-02 08:41:47
Message-ID: CAAJ_b97w_GGV-k4ErxWTpz5sAgFJ4aUYMx0khfySvANmWRzsag@mail.gmail.com
Lists: pgsql-hackers
Hi Amit,
Sorry for the delayed response.
On Fri, Jan 26, 2018 at 11:58 AM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> On Wed, Jan 24, 2018 at 12:44 PM, amul sul <sulamul(at)gmail(dot)com> wrote:
>> On Tue, Jan 23, 2018 at 7:01 PM, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>> On Fri, Jan 12, 2018 at 11:43 AM, amul sul <sulamul(at)gmail(dot)com> wrote:
[....]
> I think you can manually (via debugger) hit this by using
> PUBLICATION/SUBSCRIPTION syntax for logical replication. I think what
> you need to do is in node-1, create a partitioned table and subscribe
> it on node-2. Now, perform an Update on node-1, then stop the logical
> replication worker before it calls heap_lock_tuple. Now, in node-2,
> update the same row such that it moves the row. Now, continue the
> logical replication worker. I think it should hit your new code, if
> not then we need to think of some other way.
>
I was able to hit the new code path using the above steps. Thanks a lot for the
step-by-step guide; I really needed that.
I also found one strange behaviour in logical replication, which is reproducible
without the attached patch as well: when I performed the update on node2 while
holding a breakpoint before the heap_lock_tuple call in the replication worker,
a duplicate row was inserted on node2. See this:
== NODE 1 ==
postgres=# insert into foo values(1, 'initial insert');
INSERT 0 1
postgres=# select tableoid::regclass, * from foo;
tableoid | a | b
----------+---+----------------
foo1 | 1 | initial insert
(1 row)
== NODE 2 ==
postgres=# select tableoid::regclass, * from foo;
tableoid | a | b
----------+---+----------------
foo1 | 1 | initial insert
(1 row)
== NODE 1 ==
postgres=# update foo set a=2, b='node1_update' where a=1;
UPDATE 1
<---- BREAK POINT BEFORE heap_lock_tuple IN replication worker --->
== NODE 2 ==
postgres=# update foo set a=2, b='node2_update' where a=1;
<---- RELEASE BREAK POINT --->
postgres=# 2018-02-02 12:35:45.050 IST [91786] LOG: tuple to be
locked was already moved to another partition due to concurrent
update, retrying
postgres=# select tableoid::regclass, * from foo;
tableoid | a | b
----------+---+--------------
foo2 | 2 | node2_update
foo2 | 2 | node1_update
(2 rows)
== NODE 1 ==
postgres=# select tableoid::regclass, * from foo;
tableoid | a | b
----------+---+--------------
foo2 | 2 | node1_update
(1 row)
I am thinking of reporting this in a separate thread, but I am not sure
whether this is already known behaviour.
== schema to reproduce above case ==
-- node1
create table foo (a int2, b text) partition by list (a);
create table foo1 partition of foo for values IN (1);
create table foo2 partition of foo for values IN (2);
insert into foo values(1, 'initial insert');
CREATE PUBLICATION update_row_mov_pub FOR ALL TABLES;
ALTER TABLE foo REPLICA IDENTITY FULL;
ALTER TABLE foo1 REPLICA IDENTITY FULL;
ALTER TABLE foo2 REPLICA IDENTITY FULL;
-- node2
create table foo (a int2, b text) partition by list (a);
create table foo1 partition of foo for values IN (1);
create table foo2 partition of foo for values IN (2);
CREATE SUBSCRIPTION update_row_mov_sub CONNECTION 'host=localhost dbname=postgres' PUBLICATION update_row_mov_pub;
== END==
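For completeness, here is roughly how I set the breakpoint mentioned above. This is only a sketch: the worker pid lookup via pg_stat_activity and the exact gdb invocation are my environment's details, not part of the patch; substitute the actual pid reported on node2.

```shell
# On node2, find the logical replication apply worker's pid,
# then attach gdb and break before heap_lock_tuple is reached.
psql -d postgres -c "SELECT pid, backend_type FROM pg_stat_activity
                     WHERE backend_type = 'logical replication worker';"
gdb -p <worker_pid> -ex 'break heap_lock_tuple' -ex 'continue'
# Perform the UPDATE on node2, then release the breakpoint with:
#   (gdb) continue
```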
Updated patch attached -- it corrects the changes in execReplication.c.
Regards,
Amul Sul
Attachments:
  0001-Invalidate-ip_blkid-v5-wip2.patch (application/octet-stream, 12.7 KB)
  0002-isolation-tests-v3.patch (application/octet-stream, 9.0 KB)