From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> |
Cc: | pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Setting visibility map in VACUUM's second phase |
Date: | 2012-12-06 18:01:33 |
Message-ID: | 6230.1354816893@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com> writes:
> So the idea that the patch implements is this. When we scan pages in
> the first phase of vacuum, if we find a page that has all-visible
> tuples but also has one or more dead tuples that we know the second
> phase of vacuum will remove, we mark such page with a special flag
> called PD_ALL_VISIBLE_OR_DEAD (I'm not married to the name, so feel
> free to suggest a better name). What this flag tells the second phase
> of vacuum is to mark the page all-visible and set the VM bit unless
> some intermediate action on the page again inserts a not-all-visible
> tuple. If such an action happens, the PD_ALL_VISIBLE_OR_DEAD flag will
> be cleared by that backend and the second phase of vacuum will not set
> all-visible flag and VM bit.
This seems overly complicated, as well as a poor use of a precious page
header flag bit. Why not just recheck all-visibility status after
performing the deletions? Keep in mind that even if there were some
not-all-visible tuples when we were on the page the first time, they
might have become all-visible by the time we return. So this is going
out of its way to produce a less-than-optimal result.
> The patch itself is quite small and works as intended. One thing that
> demanded special handling is the fact that visibilitymap_set()
> requires a cutoff xid to tell the Hot Standby to resolve conflicts
> appropriately. We could have scanned the page again during the second
> phase to compute the cutoff xid, but I thought we can overload the
> pd_prune_xid field in the page header to store this information which
> is already computed in the first phase.
And this is *far* too cute. The use of that field is complicated enough
already without having its meaning vary depending on randomly-designed
flag bits. And it's by no means obvious that using a by-now-old value
when we return to the page is a good idea anyway (or even correct).
I think taking a second whack at setting the visibility bit is a fine
idea, but let's drop all the rest of this premature optimization.
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2012-12-06 18:08:51 | Re: How to check whether the row was modified by this transaction before? |
Previous Message | Kevin Grittner | 2012-12-06 18:00:08 | Re: Functional dependency in GROUP BY through JOINs |