From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Hot standby and b-tree killed items |
Date: | 2008-12-19 09:38:44 |
Message-ID: | 1229679524.4793.481.camel@ebony.2ndQuadrant |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, 2008-12-19 at 10:49 +0200, Heikki Linnakangas wrote:
> Whenever a B-tree index scan fetches a heap tuple that turns out to be
> dead, the B-tree item is marked as killed by calling _bt_killitems. When
> the page gets full, all the killed items are removed by calling
> _bt_vacuum_one_page.
>
> That's a problem for hot standby. If any of the killed b-tree items
> point to a tuple that is still visible to a running read-only
> transaction, we have the same situation as with vacuum, and have to
> either wait for the read-only transaction to finish before applying the
> WAL record or kill the transaction.
>
> It looks like there's some cosmetic changes related to that in the
> patch, the signature of _bt_delitems is modified, but there's no actual
> changes that would handle that situation. I didn't see it on the TODO on
> the hot standby wiki either. Am I missing something, or the patch?
ResolveRedoVisibilityConflicts() describes the current patch's position
on this point, which on review is wrong, I agree.
It looks like I assumed that _bt_delitems is only called during VACUUM,
which I knew it wasn't. I know I was going to split XLOG_BTREE_VACUUM
into two record types at one point, one for delete, one for vacuum. In
the end I didn't. Anyhow, its wrong.
We have infrastructure in place to make this work correctly, just need
to add latestRemovedXid field to xl_btree_vacuum. So that part is easily
solved.
Thanks for spotting it. More like that please!
--
Simon Riggs www.2ndQuadrant.com
PostgreSQL Training, Services and Support
From | Date | Subject | |
---|---|---|---|
Next Message | Grzegorz Jaskiewicz | 2008-12-19 09:57:42 | Re: possible bug in 8.4 |
Previous Message | Simon Riggs | 2008-12-19 09:22:12 | Re: Sync Rep: First Thoughts on Code |