From: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: pgsql: Compute XID horizon for page level index vacuum on primary. |
Date: | 2019-03-29 09:37:11 |
Message-ID: | CANP8+jKUF-DE9um44zEfxOY277byx3b4NMFrY+gwj=+=Yc=0wQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers pgsql-hackers |
On Wed, 27 Mar 2019 at 00:06, Andres Freund <andres(at)anarazel(dot)de> wrote:
> Compute XID horizon for page level index vacuum on primary.
>
> Previously the xid horizon was only computed during WAL replay.
This commit message was quite confusing. It took me a while to realize this
relates to btree index deletes and that what you mean is that we are
calculcating the latestRemovedXid for index entries. That is related to but
not same thing as the horizon itself. So now I understand the "was computed
only during WAL replay" since it seemed obvious that the xmin horizon was
calculcated regularly on the master, but as you say the latestRemovedXid
was not.
Now I understand, I'm happy that you've moved this from redo into mainline.
And you've optimized it, which is also important (since performance was the
original objection and why it was placed in redo). I can see you've removed
duplicate code in hash indexes as well, which is good.
The term "xid horizon" was only used once in the code in PG11. That usage
appears to be a typo, since in many other places the term "xmin horizon" is
used to mean the point at which we can finally mark tuples as dead. Now we
have some new, undocumented APIs that use the term "xid horizon" yet still
code that refers to "xmin horizon", with neither term being defined. I'm
hoping you'll do some later cleanup of that to avoid confusion.
While trying to understand this, I see there is an even better way to
optimize this. Since we are removing dead index tuples, we could alter the
killed index tuple interface so that it returns the xmax of the tuple being
marked as killed, rather than just a boolean to say it is dead. Indexes can
then mark the killed tuples with the xmax that killed them rather than just
a hint bit. This is possible since the index tuples are dead and cannot be
used to follow the htid to the heap, so the htid is redundant and so the
block number of the tid could be overwritten with the xmax, zeroing the
itemid. Each killed item we mark with its xmax means one less heap fetch we
need to perform when we delete the page - it's possible we optimize that
away completely by doing this.
Since this point of the code is clearly going to be a performance issue it
seems like something we should do now.
--
Simon Riggs http://www.2ndQuadrant.com/
<http://www.2ndquadrant.com/>
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2019-03-29 10:19:52 | pgsql: Fix incorrect code in new REINDEX CONCURRENTLY code |
Previous Message | Michael Paquier | 2019-03-29 08:09:27 | Re: pgsql: REINDEX CONCURRENTLY |
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2019-03-29 09:50:31 | Re: propagating replica identity to partitions |
Previous Message | Peter Eisentraut | 2019-03-29 09:34:49 | Re: patch to allow disable of WAL recycling |