From: | Simon Riggs <simon(at)2ndQuadrant(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Proposal: Another attempt at vacuum improvements |
Date: | 2011-05-25 13:14:22 |
Message-ID: | BANLkTi=Arq+vFwmFO9v7JOdcgFdJYi0UeQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, May 25, 2011 at 1:27 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> At the moment we scan indexes if we have > 0 rows to remove, which is
>> probably wasteful. Perhaps it would be better to keep a running total
>> of rows to remove, by updating pg_stats, then when we hit a certain
>> threshold in total we can do the index scan. So we don't need to
>> remember the TIDs, just remember how many there were and use that to
>> avoid cleaning too vigorously.
>
> That occurred to me, too. If we're being launched by autovacuum then
> we know that a number of updates and deletes equal ~20% (or whatever
> autovacuum_vacuum_scale_factor is set to) of the table size have
> occurred since the last autovacuum. But it's possible that many of
> those were HOT updates, in which case the number of index entries to
> be cleaned up might be much less than 20% of the table size.
> Alternatively, it's possible that we'd be better off vacuuming the
> table more often (say, autovacuum_vacuum_scale_factor=0.10 or 0.08 or
> something) but only doing the index scans every once in a while when
> enough dead line pointers have accumulated. After all, it's the first
> heap pass that frees up most of the space; cleaning dead line pointers
> seems a bit less urgent. But not having done any real analysis of how
> this would work out in practice, I'm not sure whether it's a good idea
> or not.
We know whether a TID was once in the index or not, so we can keep an
exact count. HOT doesn't come into it.
We can remove TIDs from index as well without VACUUM during btree
split avoidance. We can optimise the second scan by skipping htids no
longer present in the index, though we'd need a spare bit to mark
usage that which I'm not sure we have.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
From | Date | Subject | |
---|---|---|---|
Next Message | Leonardo Francalanci | 2011-05-25 13:43:58 | Re: use less space in xl_xact_commit patch |
Previous Message | Simon Riggs | 2011-05-25 13:05:39 | Re: tackling full page writes |