From: | Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgreSQL(dot)org> |
Subject: | Gin page deletion bug |
Date: | 2013-11-07 20:49:14 |
Message-ID: | 527BFCCA.9000000@vmware.com |
Lists: | pgsql-hackers |
Gin page deletion fails to take into account that there might be a
search in-flight to the page that is deleted. If the page is reused for
something else, the search can get very confused.
That's pretty difficult to reproduce in a real system, as the window
between releasing the lock on a page and following its right-link is very
tight, but it's easy if you set a breakpoint with a debugger. Here's how
I reproduced it:
-----------
1. Put a breakpoint or sleep in entryGetNextItem() function, where it
has released lock on one page and is about to read the next one. I used
this patch:
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -574,6 +574,9 @@ entryGetNextItem(GinState *ginstate, GinScanEntry entry)
                 return;
             }
+            elog(NOTICE, "about to move right to page %u", blkno);
+            sleep(5);
+
             entry->buffer = ReleaseAndReadBuffer(entry->buffer,
                                                  ginstate->index,
                                                  blkno);
2. Initialize a table with a gin index in a suitable state:
create extension btree_gin;
create table foo (i int4);
create index i_gin_foo on foo using gin (i) with (fastupdate = off);
insert into foo select 1 from generate_series(1, 5000);
insert into foo select 2 from generate_series(1, 5000);
set enable_bitmapscan=off; set enable_seqscan=on;
delete from foo where i = 1;
3. Start a query. It will sleep between every page:
set enable_bitmapscan=on; set enable_seqscan=off;
select * from foo where i = 1;
postgres=# select * from foo where i = 1;
NOTICE: about to move right to page 3
NOTICE: about to move right to page 5
...
4. In another session, delete and reuse the pages:
vacuum foo;
insert into foo select 2 from generate_series(1, 10000) g;
5. Let the query run to completion. It will return a lot of tuples with
i=2, which should not have matched:
...
NOTICE: about to move right to page 24
NOTICE: about to move right to page 25
i
---
2
2
2
...
-----------
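The interleaving in steps 3 to 5 can be sketched as a toy model. This is not PostgreSQL code: ToyPage, owner_key and toy_scan are hypothetical names invented for illustration, and a "page" here only records which key it belongs to and its right-link. The scan reads the right-link, "releases the lock", and if interference is requested the next page is deleted and reused for key 2 before the scan arrives there.

```c
#define TOY_INVALID (-1)

/* Hypothetical stand-in for an index page: which key it serves and
 * the block number of its right sibling. */
typedef struct {
    int owner_key;
    int rightlink;
} ToyPage;

/*
 * Walk the chain from 'start' on behalf of a search for 'key'.
 * If 'interfere' is nonzero, a delete-and-reuse of the next page is
 * simulated once, in the window after the right-link was read but
 * before the next page is visited.
 * Returns the number of visited pages that belong to another key.
 */
int toy_scan(ToyPage *pages, int start, int key, int interfere)
{
    int wrong = 0;
    int blkno = start;

    while (blkno != TOY_INVALID)
    {
        if (pages[blkno].owner_key != key)
            wrong++;                      /* scan is now reading foreign data */

        int next = pages[blkno].rightlink;  /* lock on blkno is released here */

        if (interfere && next != TOY_INVALID)
        {
            /* "vacuum" deletes the page, then an insert reuses it for key 2 */
            pages[next].owner_key = 2;
            pages[next].rightlink = TOY_INVALID;
            interfere = 0;
        }
        blkno = next;
    }
    return wrong;
}
```

With a three-page chain that all belongs to key 1, the undisturbed scan sees no foreign pages, while the disturbed one lands on a page that now belongs to key 2, which is the analogue of the i=2 rows above.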
The regular b-tree code solves this by stamping deleted pages with the
current XID, and only allowing them to be reused once that XID becomes
old enough (< RecentGlobalXmin). Another approach might be to grab a
cleanup-strength lock on the left and parent pages when deleting a page,
and to require the search to keep the pin on the page it's coming from
until it has locked the next page.
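A minimal sketch of the XID-stamping idea, under loud assumptions: ToyXid, ToyPageHeader, toy_mark_deleted and toy_page_recyclable are hypothetical names, not the b-tree code's actual structures, and the plain `<` comparison ignores XID wraparound, which real code would have to handle with a wraparound-aware comparison.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t ToyXid;   /* hypothetical; ignores wraparound */

/* Hypothetical page header: a deleted flag plus the XID that was
 * current when the page was deleted. */
typedef struct {
    bool   deleted;
    ToyXid delete_xid;
} ToyPageHeader;

/* Stamp the page with the current XID at deletion time. */
void toy_mark_deleted(ToyPageHeader *page, ToyXid current_xid)
{
    page->deleted = true;
    page->delete_xid = current_xid;
}

/*
 * The page may be reused only once the stamped XID is older than the
 * oldest XID any running transaction could still care about
 * (the role RecentGlobalXmin plays above). Until then a search that
 * read a link to this page before the deletion may still be in flight.
 */
bool toy_page_recyclable(const ToyPageHeader *page, ToyXid oldest_xmin)
{
    return page->deleted && page->delete_xid < oldest_xmin;
}
```

In this model a page deleted at XID 50 stays off-limits while the oldest xmin is 50 or lower, and becomes reusable once it advances to 51 or beyond.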
- Heikki