From: | "Wood, Dan" <hexpert(at)amazon(dot)com> |
---|---|
To: | "pgsql-bugs(at)postgresql(dot)org" <pgsql-bugs(at)postgresql(dot)org> |
Subject: | Old row version in hot chain become visible after a freeze |
Date: | 2017-08-31 22:36:43 |
Message-ID: | E5711E62-8FDF-4DCA-A888-C200BF6B5742@amazon.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-bugs |
From: Wood, Dan hexpert(at)amazon(dot)com
I’ve found a bug in Postgres which causes old row versions to appear in a table. DEAD rows in a hot chain are getting frozen and becoming visible. I’ve repro’d this in both 9.6.1 and 11-devel.
The repro consists of two short psql scripts.
While the repro does an explicit VACUUM FREEZE, this bug also happens with autovacuum.
FILE: lock.sql
begin;
select id from t where id=3 for key share;
select pg_sleep(1);
update t set x=x+1 where id=3;
commit;
vacuum freeze t;
select ctid, xmin, xmax, id from t;
FILE: repro.sql
drop table t;
create table t (id int primary key, name char(3), x integer);
insert into t values (1, '111', 0);
insert into t values (3, '333', 0);
\! psql -p 5432 postgres -f lock.sql &
\! psql -p 5432 postgres -f lock.sql &
\! psql -p 5432 postgres -f lock.sql &
\! psql -p 5432 postgres -f lock.sql &
\! psql -p 5432 postgres -f lock.sql &
It’s about 50-50 whether any given run of repro.sql will produce output like:
…
ctid | xmin | xmax | id | x
-------+------+------+----+---
(0,1) | 984 | 0 | 1 | 0
(0,7) | 990 | 0 | 3 | 5
(2 rows)
ctid | xmin | xmax | id | x
-------+------+------+----+---
(0,1) | 984 | 0 | 1 | 0
(0,3) | 986 | 0 | 3 | 1 // This, and x = 2, 3 and 4 came back from the DEAD
(0,4) | 987 | 0 | 3 | 2
(0,5) | 988 | 0 | 3 | 3
(0,6) | 989 | 0 | 3 | 4
(0,7) | 990 | 0 | 3 | 5
(6 rows)
Root cause analysis: lazy_scan_heap() deletes DEAD tuples in heap_page_prune(). However, it is possible for concurrent commits/rollbacks to render a tuple DEAD by the time we reach the switch statement on HeapTupleSatisfiesVacuum(). If such a row IsHotUpdated or IsHeapOnly we can’t delete it below, and must allow a later prune to take care of it.
if (HeapTupleIsHotUpdated(&tuple) || HeapTupleIsHeapOnly(&tuple))
nkeep += 1; // Don't delete, allow later prune to delete it
else
tupgone = true; // We can delete it below
Because tupgone is false we freeze instead of deleting. Freezing a DEAD tuple makes it visible. Here is a comment in heap_prepare_freeze_tuple()
* It is assumed that the caller has checked the tuple with
* HeapTupleSatisfiesVacuum() and determined that it is not HEAPTUPLE_DEAD
* (else we should be removing the tuple, not freezing it).
It is rare that we run into a DEAD tuple in this way during a freeze. More often RECENTLY_DEAD is returned. But we did see this with a more realistic long running test and I was able to create the simplified test case above. Skipping the Freeze on a DEAD tuple that IsHotUpdated or IsHeapOnly does fix the problem. I’ve attached a patch with this fix.
Attachment | Content-Type | Size |
---|---|---|
update_freeze.patch | application/octet-stream | 2.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Geoghegan | 2017-08-31 22:56:20 | Re: Old row version in hot chain become visible after a freeze |
Previous Message | Sandeep Thakkar | 2017-08-31 16:57:14 | Re: BUG #14792: Invalid ssleay32.dll |