Re: BUG #13844: Logical decoding bug with subxact + row locking

From: Andres Freund <andres(at)anarazel(dot)de>
To: jarred(at)webriots(dot)com
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: BUG #13844: Logical decoding bug with subxact + row locking
Date: 2016-02-07 11:42:17
Message-ID: 20160207114217.pct2ldvbtdubriy3@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On 2016-01-05 03:32:49 +0000, jarred(at)webriots(dot)com wrote:
> This is a pretty serious logical decoding bug that bit us pretty hard
> in production.
>
> Run the following code:
>
>
> ----------------------------------------------------------------------------
> CREATE TABLE foo(a SERIAL PRIMARY KEY, b TEXT);
>
> SELECT pg_create_logical_replication_slot('regression_slot',
> 'test_decoding');
>
> INSERT INTO foo (b) VALUES ('john');
>
> BEGIN;
> SELECT * FROM foo WHERE b = 'john' FOR UPDATE;
> SAVEPOINT sp1;
> INSERT INTO foo (b) VALUES ('jane');
> ROLLBACK TO SAVEPOINT sp1;
> INSERT INTO foo (b) VALUES ('doe');
> END;
>
> SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL,
> NULL);
>
> ----------------------------------------------------------------------------
>
> The last select above should return the decoded WAL, but instead it
> returns:
>
>
> ----------------------------------------------------------------------------
> ERROR: subxact logged without previous toplevel record
>
> ----------------------------------------------------------------------------

Ugh, that's an annoying bug. Working on producing a fix.

The problem is that logical decoding expects to know about toplevel xids
before subtransaction/savepoint xids. Which is enforced in the WAL
logging code, so that's ok. But the problem is that right now the WAL
produced looks like (abbreviated):
rmgr: Heap tx: 585, lsn: 0/01A304A8, desc: LOCK off 1: xid 585 LOCK_ONLY EXCL_LOCK KEYS_UPDATED , blkref #0: rel 1663/12385/24607 blk 0
rmgr: Heap tx: 586, lsn: 0/01A30658, desc: INSERT off 2, blkref #0: rel 1663/12385/24607 blk 0 FPW
rmgr: Transaction tx: 586, lsn: 0/01A30798, desc: ABORT 2016-02-07 12:21:18.624045 CET
rmgr: Standby tx: 0, lsn: 0/01A307C0, desc: RUNNING_XACTS nextXid 587 latestCompletedXid 586 oldestRunningXid 585; 1 xacts: 585
rmgr: Heap tx: 587, lsn: 0/01A307F8, desc: INSERT off 3, blkref #0: rel 1663/12385/24607 blk 0
rmgr: Btree tx: 587, lsn: 0/01A30838, desc: INSERT_LEAF off 3, blkref #0: rel 1663/12385/24614 blk 1
rmgr: Transaction tx: 585, lsn: 0/01A30878, desc: COMMIT 2016-02-07 12:21:28.062978 CET; subxacts: 587

and decode.c doesn't care about LOCK records so far. Which means
reorderbuffer.c doesn't know about the relevant xid. Hm.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message John R Pierce 2016-02-07 11:59:25 Re: BUG #13932: German ß not a valid character in psql
Previous Message Burkhardt.Renz 2016-02-07 11:14:37 BUG #13932: German ß not a valid character in psql