Re: Idea for getting rid of VACUUM FREEZE on cold pages

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Idea for getting rid of VACUUM FREEZE on cold pages
Date: 2010-05-27 01:32:07
Message-ID: AANLkTil3irlbJRRWmIjyuCKgaT_msZuPKopwQjms0XC6@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, May 26, 2010 at 8:59 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> Robert Haas <robertmhaas(at)gmail(dot)com> writes:
>> On Wed, May 26, 2010 at 8:06 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>> Yeah.  Neither PD_ALL_VISIBLE nor the visibility map are going to solve
>>> your problem, because they cannot become set without having visited the
>>> page.
>
>> Well, maybe I'm confused here, but arranging things so that we NEVER
>> have to visit the page after initially writing it seems like it's
>> setting the bar almost impossibly high.
>
> Well, that was the use-case that Josh was on about when this idea came
> up: high-volume append-only log tables that in most cases will never be
> read, so his client wants to get rid of the extra I/O for maintenance
> visits to once-written pages.

Well, I'll just note that using PD_ALL_VISIBLE as I'm proposing is
basically equivalent to Josh's original proposal of using an XID epoch
except that it addresses all three of the "obvious issues" which he
noted in his original email; plus it doesn't prevent truncating CLOG
(on the assumption that we rejigger things not to consult clog when
the page is marked PD_ALL_VISIBLE).

> If you're willing to allow one visit and rewrite of each page, then
> we can do that today with maybe a bit of rejiggering of vacuum's
> when-to-freeze heuristics.

Hmm, yeah. Maybe we should freeze when we set PD_ALL_VISIBLE; that
might be just as good, and simpler. Assuming the visibility map is
sufficiently crash-safe/non-buggy, we could then teach VACUUM that
it's OK to advance relfrozenxid even when doing just a partial vacuum
- because any pages that were skipped must contain only frozen tuples.
Previously you've objected to proposals in this direction because
they might destroy forensic information, but maybe we should do it
anyway.

Either way, I think if we do this it *basically* gets rid of
anti-wraparound vacuum. Yeah, we'll still do routine partial vacuums,
but what you won't have is... write the table, vacuum, vacuum, vacuum,
vacuum, OK, everything's visible to everyone, don't need to vacuum any
more... months pass... boom, unexpected full-table vacuum. The
beginning part is the same, but you get rid of the boom at the end.
The only way I see to cut down vacuum activity even further is to
freeze them (and set the visibility map bit) before evicting them from
shared_buffers. That's really the only way to get "write once and
only once", but it's pretty hit or miss, because the xmin horizon
might not advance fast enough to make it actually work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2010-05-27 01:38:54 Re: functional call named notation clashes with SQL feature
Previous Message Tom Lane 2010-05-27 01:28:05 Re: functional call named notation clashes with SQL feature