From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
Cc: | pgsql-patches(at)postgresql(dot)org, Simon Riggs <simon(at)2ndquadrant(dot)com> |
Subject: | Re: Page at a time index scan |
Date: | 2006-05-08 00:11:17 |
Message-ID: | 15871.1147047077@sss.pgh.pa.us |
Lists: | pgsql-patches |
I've committed a rewritten version of this patch.

Heikki Linnakangas <hlinnaka(at)iki(dot)fi> writes:
> On Fri, 5 May 2006, Tom Lane wrote:
>> btbulkdelete arrives at a page, it need take no special action unless the
>> page is newly split *and* its right-link points to a lower physical
>> address. If that's true, then after vacuuming the page, follow its
>> right-link and vacuum that page; repeat until arriving at a page that is
>> either not newly split or is above the current location of the outer loop.
>> Then return to the outer, sequential-scan loop.
> It'd be a bit more efficient to finish the sequential-scan first, and
> memorize the newly-split pages' right-links as they're encountered. Then
> scan those pages as a separate second pass, or earlier if we run out of
> memory reserved for memorizing them.
I didn't do this. Aside from the extra memory requirement, it's not
apparent to me that it'd make things faster. The disadvantage is that
it would require more page reads than the other way: if you visit the
split page immediately, and note that its right-link is above the
current outer loop location, then you can skip following the right-link
because you know you'll visit the page later. If you postpone, you
have to chase every chain until you actually read a page with an old
cycle ID. I think this extra I/O would likely outweigh any savings from
not interrupting the main scan.
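
To make the page-read argument concrete, here is a rough toy model in Python. It is only a sketch of the reasoning above, not the committed code. The Page class, the two function names, the use of 0 for "no marker", and the five-page example are all assumptions made for illustration. The index is treated as static during the scan, which a real concurrent vacuum is not.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Page:
    cycle_id: int               # 0 = no marker (assumed encoding)
    right_link: Optional[int]   # block number of the right sibling, if any


def immediate_scan(pages: List[Page], cur_id: int) -> List[int]:
    """Sequential scan that chases backward right-links right away."""
    reads = []
    for blkno in range(len(pages)):
        reads.append(blkno)
        page = pages[blkno]
        # Chase only while the page is newly split AND its right-link points
        # below the outer loop's position; a link pointing above can be
        # skipped because the outer loop will reach that page later.
        while (page.cycle_id == cur_id
               and page.right_link is not None
               and page.right_link < blkno):
            reads.append(page.right_link)
            page = pages[page.right_link]
    return reads


def deferred_scan(pages: List[Page], cur_id: int) -> List[int]:
    """Finish the sequential scan first, then chase memorized right-links."""
    reads = list(range(len(pages)))
    pending = [p.right_link for p in pages
               if p.cycle_id == cur_id and p.right_link is not None]
    for link in pending:
        # The outer-loop position is no longer available, so each chain is
        # followed until a page with an old cycle ID is actually read.
        while link is not None:
            reads.append(link)
            page = pages[link]
            if page.cycle_id != cur_id:
                break
            link = page.right_link
    return reads


# Hypothetical index: page 3 was split during cycle 7 and its new right
# sibling landed in block 1, so the logical order is 0 -> 2 -> 3 -> 1 -> 4.
pages = [Page(0, 2), Page(7, 4), Page(0, 3), Page(7, 1), Page(0, None)]
print(immediate_scan(pages, 7))   # visit order with immediate chasing
print(deferred_scan(pages, 7))    # visit order with a deferred second pass
```

In this toy case the immediate variant reads six pages and the deferred variant reads eight, because the deferred pass cannot use the outer loop position to skip links that point forward.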
> If btbulkdelete always clears the marker (assuming zero isn't a valid
> value), 16 bits is plenty. Unless a vacuum is aborted, there should never
> be a value older than current value - 1 in the index. We could live with a
> 2-bit counter.
For the moment, the code is only clearing the marker if it's equal to
the current cycle ID. This is sufficient to recognize
definitely-already-processed pages, but it doesn't prevent false
positives in general. If we ever need the space we could narrow the
counter, at the cost of having to expend more I/O to keep the values
cleared.
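
The trade-off between counter width and false positives can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the function names are hypothetical, zero is assumed to be reserved as "no marker", and the counter is assumed to advance by one per vacuum cycle and wrap around.

```python
NO_MARKER = 0  # assumed: zero is reserved to mean "no cycle ID"


def next_cycle_id(cycle_id: int, bits: int) -> int:
    """Advance a wrapping counter of the given width, skipping NO_MARKER."""
    nxt = (cycle_id + 1) % (1 << bits)
    return nxt if nxt != NO_MARKER else 1


def clear_if_current(page_marker: int, current_id: int) -> int:
    """Clear the marker only when it equals the current cycle ID."""
    return NO_MARKER if page_marker == current_id else page_marker


def cycles_until_false_positive(stale_marker: int, start_id: int, bits: int) -> int:
    """Count vacuum cycles until the current ID collides with a stale marker."""
    cycles, cur = 0, start_id
    while True:
        cur = next_cycle_id(cur, bits)
        cycles += 1
        if cur == stale_marker:
            return cycles


# A marker from an older cycle is left alone by the current vacuum...
print(clear_if_current(4, 5))
# ...and would look "newly split" again once the counter wraps back to it.
print(cycles_until_false_positive(1, 1, bits=2))
print(cycles_until_false_positive(1, 1, bits=16))
```

Under these assumptions a marker that was never cleared collides with the current ID after 3 cycles for a 2-bit counter, versus 65535 cycles for a 16-bit one, which is why a narrower counter would need more aggressive clearing.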
regards, tom lane