From: | "Pavan Deolasee" <pavan(dot)deolasee(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-patches(at)postgresql(dot)org |
Subject: | HOT WIP Patch - version 2 |
Date: | 2007-02-20 06:38:14 |
Message-ID: | 2e78013d0702192238i5aefec84j1275b9a96e199373@mail.gmail.com |
Reposting - looks like the message did not get through in the first
attempt. My apologies if multiple copies are received.
This is the next version of the HOT WIP patch. Since the last patch that
I sent out, I have implemented the HOT-update chain pruning mechanism.
When following a HOT-update chain from the index fetch, if we notice that
the root tuple is dead and it is HOT-updated, we try to prune the chain to
the smallest possible length. To do that, the share lock is upgraded to an
exclusive lock and the tuple chain is followed till we find a
live/recently-dead tuple. At that point, the root t_ctid is made to point
to that tuple. In order to preserve the xmax/xmin chain, the xmax of the
root tuple is also updated to the xmin of the found tuple. Since this xmax
is also < RecentGlobalXmin and is a committed transaction, the visibility
of the root tuple still remains the same.
The intermediate heap-only tuples are removed from the HOT-update chain.
The HOT-updated status of these tuples is cleared and their respective
t_ctids are made to point to themselves. These tuples are now unreachable
and ready for vacuuming. This entire action is logged in a single
WAL record.
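The pruning step described above can be sketched as a toy model in C. This
is not code from the patch: ToyTuple, toy_is_dead() and toy_prune_chain()
are hypothetical names, slots in an array stand in for line pointers, and
every xid is assumed to be committed. Locking and WAL logging are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef uint32_t Xid;

typedef struct ToyTuple {
    Xid  xmin;         /* inserting transaction */
    Xid  xmax;         /* updating transaction, 0 if none */
    int  ctid;         /* slot of the next version, own slot if none */
    bool hot_updated;  /* a newer version follows in the same block */
    bool heap_only;    /* no index entry points at this tuple */
} ToyTuple;

/* Dead to every snapshot: updated by a transaction older than the horizon.
 * Assumption: all xids in this model are committed. */
static bool toy_is_dead(const ToyTuple *t, Xid recent_global_xmin)
{
    return t->xmax != 0 && t->xmax < recent_global_xmin;
}

/* Prune the chain that starts at slot `root`.
 * Returns the number of intermediate tuples unlinked from the chain. */
static int toy_prune_chain(ToyTuple *page, int root, Xid recent_global_xmin)
{
    if (!toy_is_dead(&page[root], recent_global_xmin) || !page[root].hot_updated)
        return 0;

    int cur = page[root].ctid;
    int removed = 0;

    /* Skip dead, HOT-updated versions until a live/recently-dead one
     * (or the end of the chain) is reached. */
    while (toy_is_dead(&page[cur], recent_global_xmin) && page[cur].hot_updated) {
        int next = page[cur].ctid;
        page[cur].hot_updated = false;  /* clear HOT-updated status */
        page[cur].ctid = cur;           /* point the tuple at itself */
        removed++;
        cur = next;
    }

    page[root].ctid = cur;              /* root now points at the survivor */
    page[root].xmax = page[cur].xmin;   /* keep the xmax/xmin chain intact */
    return removed;
}
```

The new root xmax is the xmin of the surviving tuple, which equals the xmax
of the last dead version skipped, so in this model it stays below the
horizon and the root remains dead.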
During vacuuming, we keep track of the number of root tuples vacuumed.
If this count is zero, then the index cleanup step is skipped. This
would avoid unnecessary index scans whenever possible.
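A minimal sketch of that bookkeeping, again as a toy model rather than the
patch's code. ToyDeadTuple, ToyVacuumResult and toy_vacuum_heap() are
hypothetical names; the only point is that heap-only tuples have no index
entries, so only vacuumed root tuples make an index pass necessary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef struct ToyDeadTuple {
    bool heap_only;  /* true if no index entry points at this tuple */
} ToyDeadTuple;

typedef struct ToyVacuumResult {
    int  root_tuples_vacuumed;
    int  heap_only_tuples_vacuumed;
    bool index_cleanup_needed;
} ToyVacuumResult;

/* Count what the heap pass removed and decide whether the index
 * cleanup step can be skipped. */
static ToyVacuumResult toy_vacuum_heap(const ToyDeadTuple *dead, size_t ndead)
{
    ToyVacuumResult r = {0, 0, false};

    for (size_t i = 0; i < ndead; i++) {
        if (dead[i].heap_only)
            r.heap_only_tuples_vacuumed++;
        else
            r.root_tuples_vacuumed++;
    }

    /* Only root tuples are referenced from indexes. */
    r.index_cleanup_needed = (r.root_tuples_vacuumed > 0);
    return r;
}
```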
This patch should apply cleanly on current CVS head and pass all regression
tests. I am still looking for review comments from the first WIP patch. If
anyone has already looked through it and is interested in the incremental
changes, please let me know. I can post that.
What's Next?
------------
ISTM that the basic HOT-updates and the ability to prune the HOT-update
chain should help us reduce the index bloat, limit the overhead of ctid
following in index fetch and efficiently vacuum heap-only tuples. IMO the
next important but rather less troublesome thing to tackle is to reuse
space within a block without a complete vacuum of the table. This would
help us do many more HOT-updates and thus further reduce index/heap bloat.
I am thinking of reusing the DEAD heap-only tuples which get removed from
the HOT-update chain as part of the pruning operation. Since these tuples,
once removed from the chain, are neither reachable nor have any index
references, they could be readily used for storing newer versions of the
same or other rows in the block. How about setting LP_DELETE on these
tuples as part of the prune operation? LP_DELETE is unused for heap tuples,
if I am not mistaken. Other information like length and offset is
maintained as it is.
When we run out of space for update-within-the-block, we traverse
through all the line pointers looking for LP_DELETEd items. If any of these
items has space large enough to store the new tuple, that item is reused.
Does anyone see any issue with doing this? Also, any suggestions
about doing it in a better way?
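The line pointer scan proposed above could look roughly like this. It is a
sketch under assumptions, not the patch: ToyItemId, the TOY_LP_* flags and
toy_claim_reusable_item() are hypothetical, and a first-fit policy is
assumed since the proposal only asks for an item that is large enough.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: these flags and types are not PostgreSQL's ItemIdData. */
#define TOY_LP_USED   0x01
#define TOY_LP_DELETE 0x02

typedef struct ToyItemId {
    uint16_t off;    /* offset of the tuple within the page */
    uint16_t len;    /* length of the tuple's storage, kept after pruning */
    uint8_t  flags;  /* TOY_LP_* bits */
} ToyItemId;

/* Scan all line pointers for the first LP_DELETEd item whose storage is
 * large enough, claim it by clearing the delete flag, and return its
 * index. Returns -1 if no item fits. */
static int toy_claim_reusable_item(ToyItemId *items, int nitems, uint16_t needed)
{
    for (int i = 0; i < nitems; i++) {
        if ((items[i].flags & TOY_LP_DELETE) && items[i].len >= needed) {
            items[i].flags &= (uint8_t) ~TOY_LP_DELETE;
            return i;
        }
    }
    return -1;
}
```

First fit keeps the scan cheap but can waste space when a small tuple lands
in a large hole; a best-fit scan would be a straightforward variation.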
If the page gets really fragmented, we can try to grab a VACUUM-strength
lock on the page and de-fragment it. The lock is tried conditionally to
avoid any deadlocks. This is done in the heap_update() code path, so it
would add some overhead, but may still prove better than putting the tuple
in a different block and having corresponding index insert(s). Also, since
we are more concerned about the large tables, the chances of being able to
upgrade the exclusive lock to vacuum-strength lock are high. Comments?
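The decision in the update path might be sketched as below. This is a toy
model with hypothetical names (ToyPage, toy_try_cleanup_lock(),
toy_place_update()). It assumes that a VACUUM-strength lock means holding
the exclusive lock while no other backend has the page pinned, and that the
conditional attempt simply fails instead of waiting.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in PostgreSQL. */
typedef struct ToyPage {
    int  pin_count;        /* backends referencing the page, including us */
    bool exclusive_held;   /* we already hold the exclusive lock */
    int  free_contiguous;  /* free bytes available in one piece */
    int  free_fragmented;  /* free bytes scattered in holes */
} ToyPage;

typedef enum {
    TOY_FITS,              /* enough contiguous space already */
    TOY_DEFRAGMENTED,      /* space made by compacting the page */
    TOY_USE_OTHER_BLOCK    /* fall back to a different block */
} ToyPlacement;

/* Conditional upgrade: succeeds only if nobody else has the page pinned.
 * It never waits, so it cannot contribute to a deadlock. */
static bool toy_try_cleanup_lock(const ToyPage *p)
{
    return p->exclusive_held && p->pin_count == 1;
}

/* Decide where a new tuple version of `needed` bytes can go. */
static ToyPlacement toy_place_update(ToyPage *p, int needed)
{
    if (p->free_contiguous >= needed)
        return TOY_FITS;

    if (p->free_contiguous + p->free_fragmented >= needed &&
        toy_try_cleanup_lock(p)) {
        /* De-fragment: merge the holes into the contiguous free area. */
        p->free_contiguous += p->free_fragmented;
        p->free_fragmented = 0;
        return TOY_DEFRAGMENTED;
    }

    return TOY_USE_OTHER_BLOCK;
}
```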
If there are no objections, I am planning to work on the first part
while Nikhil would take up the second task of block level retail-vacuum.
Your comments on these issues and the patch are really appreciated.
Thanks,
Pavan
--
EnterpriseDB http://www.enterprisedb.com
Attachment | Content-Type | Size |
---|---|---|
NewHOT-v2.0.patch.gz | application/x-gzip | 15.2 KB |