From: | Andres Freund <andres(at)2ndquadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Jim Nasby <jim(at)nasby(dot)net>, Bruce Momjian <bruce(at)momjian(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Peter Geoghegan <peter(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: [PERFORM] encouraging index-only scans |
Date: | 2013-09-19 22:59:29 |
Message-ID: | 20130919225929.GC11116@awork2.anarazel.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
On 2013-09-19 14:39:43 -0400, Robert Haas wrote:
> On Tue, Sep 17, 2013 at 7:10 PM, Andres Freund <andres(at)2ndquadrant(dot)com> wrote:
> > I generally think the current logic for triggering VACUUMs via
> > autovacuum doesn't really make all that much sense in the days where we
> > have the visibility map.
>
> Right now, whether or not to autovacuum is the rest of a two-pronged
> test. The first prong is based on number of updates and deletes
> relative to table size; that triggers a regular autovacuum. The
> second prong is based on age(relfrozenxid) and triggers a
> non-page-skipping vacuum (colloquially, an anti-wraparound vacuum).
And I have some hopes we can get rid of that in 9.4 (that alone would be
worth a bump to 10.0 ;)). I really like Heikki's patch, even if I am
envious that I didn't have the idea :P. Although it needs quite a bit of
work to be ready.
> The typical case in which this doesn't work out well is when the table
> has a lot of inserts but few or no updates and deletes. So I propose
> that we change the first prong to count inserts as well as updates and
> deletes when deciding whether it needs to vacuum the table. We
> already use that calculation to decide whether to auto-analyze, so it
> wouldn't be very novel. We know that the work of marking pages
> all-visible will need to be done at some point, and doing it sooner
> will result in doing it in smaller batches, which seems generally
> good.
Yes, that's a desperately needed change.
The reason I suggested keeping track of the xids of unremovable tuples
is that the current logic doesn't handle that at all. We just
unconditionally set n_dead_tuples to zero after a vacuum even if not a
single row could actually be cleaned out. Which has the effect that we
will not start a vacuum until enough bloat (or after changing this, new
inserts) has collected to start vacuum anew. Which then will do twice
the work.
Resetting n_dead_tuples to the actual remaining dead tuples wouldn't do
much good either - we would just immediately trigger a new vacuum the
next time we check, even if the xmin horizon is still the same.
> However, I do have one concern: it might lead to excessive
> index-vacuuming. Right now, we skip the index vac step only if there
> ZERO dead tuples are found during the heap scan. Even one dead tuple
> (or line pointer) will cause an index vac cycle, which may easily be
> excessive. So I further propose that we introduce a threshold for
> index-vac; so that we only do index vac cycle if the number of dead
> tuples exceeds, say 0.1% of the table size.
Yes, that's a pretty valid concern. But we can't really do it that
easily. a) We can only remove dead line pointers when we know there's no
index pointing to it anymore. Which we only know after the index has
been removed. b) We cannot check the validity of an index pointer if
there's no heap tuple for it. Sure, we could check whether we're
pointing to a dead line pointer, but the random io costs of that are
prohibitive.
Now, we could just mark line pointers as dead and not mark that page as
all-visible and pick it up again on the next vacuum cycle. But that
would suck long-term.
I think the only real solution here is to store removed tuples tids
(i.e. items where we've marked as dead) somewhere. Whenever we've found
sufficient tuples to-be-removed from indexes we do phase 2.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Greg Stark | 2013-09-19 23:13:18 | Re: Dead code or buggy code? |
Previous Message | MauMau | 2013-09-19 22:42:19 | Re: UTF8 national character data type support WIP patch and list of open issues. |
From | Date | Subject | |
---|---|---|---|
Next Message | Bartłomiej Romański | 2013-09-20 00:49:18 | Planner performance extremely affected by an hanging transaction (20-30 times)? |
Previous Message | Kevin Grittner | 2013-09-19 21:36:53 | Re: [PERFORM] encouraging index-only scans |