From: | Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru> |
---|---|
To: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Concurrency bug in amcheck |
Date: | 2020-04-21 09:54:27 |
Message-ID: | CAPpHfdt_OTyQpXaPJcWzV2N-LNeNJseNB-K_A66qG=L518VTFw@mail.gmail.com |
Lists: | pgsql-hackers |
Hi!
I found a concurrency bug in amcheck when it runs on a replica. When
btree_xlog_unlink_page() replays a page deletion on the replica, the
deleted page is left with no items. But if amcheck steps on such a
deleted page, palloc_btree_page() expects it to have items.
(lldb_on_primary) b btbulkdelete
primary=# drop table test;
primary=# create table test as (select random() x from
generate_series(1,1000000) i);
primary=# create index test_x_idx on test(x);
primary=# delete from test;
primary=# vacuum test;
(lldb_on_replica) b bt_check_level_from_leftmost
replica=# select bt_index_check('test_x_idx');
# skip to internal level
(lldb_on_replica) c
(lldb_on_replica) b palloc_btree_page
# skip to non-leftmost page
(lldb_on_replica) c
(lldb_on_replica) c
# concurrently delete btree pages
(lldb_on_primary) c
# continue with pages
(lldb_on_replica) c
Finally, the replica reports an error:
ERROR: internal block 289 in index "test_x_idx" lacks high key and/or
at least one downlink
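As a rough illustration of the failure mode, here is a toy model in C. It is
not the amcheck source and not the attached patch: ToyPage and the toy_*
functions are hypothetical names, and skipping deleted pages is just one
possible way to relax the check. It only shows why an internal page that was
emptied by replaying a deletion trips an "every internal page has items" test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a B-tree page.
 * This is NOT PostgreSQL's real page layout. */
typedef struct ToyPage
{
    bool is_leaf;       /* leaf page vs. internal page */
    bool is_deleted;    /* page has been unlinked from the tree */
    bool is_rightmost;  /* rightmost page on its level (no high key) */
    int  nitems;        /* number of items currently on the page */
} ToyPage;

/* Offset of the first data item: 1 on a rightmost page, otherwise 2,
 * because offset 1 is then taken by the high key. */
static int
toy_first_data_key(const ToyPage *page)
{
    return page->is_rightmost ? 1 : 2;
}

/* Strict check: every internal page must hold a high key (unless it is
 * rightmost) plus at least one downlink. A deleted page with no items
 * fails this check. */
static bool
toy_check_strict(const ToyPage *page)
{
    assert(page != NULL);
    if (!page->is_leaf && page->nitems < toy_first_data_key(page))
        return false;   /* "lacks high key and/or at least one downlink" */
    return true;
}

/* Relaxed check (assumption: one possible approach, not necessarily the
 * attached patch): a deleted page is allowed to be empty. */
static bool
toy_check_skip_deleted(const ToyPage *page)
{
    assert(page != NULL);
    if (page->is_deleted)
        return true;
    return toy_check_strict(page);
}
```

In this model, an internal page with is_deleted set and nitems equal to 0
fails toy_check_strict() but passes toy_check_skip_deleted(), while a live
internal page with items passes both.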
A proposed fix is attached. The bug was spotted by Konstantin Knizhnik;
the reproduction case and the fix are from me.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachment | Content-Type | Size |
---|---|---|
fix_amcheck_concurrency.patch | application/octet-stream | 1.6 KB |