From: Markus Schiltknecht <markus(at)bluegap(dot)ch>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: Gokulakannan Somasundaram <gokul007(at)gmail(dot)com>, pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Some ideas about Vacuum
Date: 2008-01-09 16:40:05
Message-ID: 4784F8E5.4020403@bluegap.ch
Lists: pgsql-hackers
Hi,
Gregory Stark wrote:
> That's an interesting thought. I think your caveats are right but with some
> more work it might be possible to work it out. For example if a background
> process processed the WAL and accumulated an array of possibly-dead tuples to
> process in batch. It would wait whenever it sees an xid which isn't yet past
> globalxmin, and keep accumulating until it has enough to make it worthwhile
> doing a pass.
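If I understand the proposal correctly, it amounts to roughly the following sketch (purely illustrative; the stream argument, get_global_xmin and prune_batch are made-up placeholders, not anything that exists in the backend, and xid wraparound is ignored):

from time import sleep

BATCH_SIZE = 10_000              # dead-tuple candidates worth a cleanup pass

def wal_driven_vacuum(stream, get_global_xmin, prune_batch):
    """Background process: batch possibly-dead tuples found in the WAL."""
    batch = []                   # (relation, block, offset) triples
    for xid, rel, blk, off in stream:
        # Wait until the deleting/updating xid is older than every running
        # snapshot; only then is the old tuple version certainly dead.
        while xid >= get_global_xmin():
            sleep(1)
        batch.append((rel, blk, off))
        if len(batch) >= BATCH_SIZE:
            prune_batch(batch)   # one batched cleanup pass over these pages
            batch.clear()
    if batch:
        prune_batch(batch)       # flush the remainder at end of stream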
I don't understand why one would want to go via the WAL; that only
creates needless I/O. Better to accumulate the data right away, during the
inserts, updates and deletes. Spilling the accumulated data to disk, if
absolutely required, would presumably still result in less I/O.
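What I have in mind is more along these lines (again just a sketch with made-up names, to show where the information would be collected and spilled):

import pickle
import tempfile

SPILL_THRESHOLD = 100_000        # in-memory candidates before spilling

class DeadTupleCollector:
    def __init__(self):
        self.pending = []                      # (xid, relation, block, offset)
        self.spill = tempfile.TemporaryFile()  # only touched when we overflow

    def note_dead_candidate(self, xid, rel, blk, off):
        """Hooked into the UPDATE/DELETE path: remember the superseded tuple."""
        self.pending.append((xid, rel, blk, off))
        if len(self.pending) >= SPILL_THRESHOLD:
            # spill now, instead of re-reading the same facts from the WAL later
            pickle.dump(self.pending, self.spill)
            self.pending.clear()

    def drain(self):
        """Hand everything over to a vacuum-like cleanup pass."""
        self.spill.seek(0)
        spilled = []
        try:
            while True:
                spilled.extend(pickle.load(self.spill))
        except EOFError:
            pass
        return spilled + self.pending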
> I think a bigger issue with this approach is that it ties all your tables
> together. You can't process one table frequently while some other table has
> some long-lived deleted tuples.
Don't use the WAL as the source of that information, and that issue is gone.
> I'm also not sure it really buys us anything over having a second
> dead-space-map data structure. The WAL is much larger and serves other
> purposes which would limit what we can do with it.
Exactly.
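For reference, as I understand it the dead-space map amounts to little more than one bit per heap page saying "this page may contain dead tuples"; a trivial sketch, with the obvious caveat that a real implementation would live in shared memory and on disk, not in a Python set:

class DeadSpaceMap:
    def __init__(self):
        self.dirty_pages = set()    # block numbers that may need vacuuming

    def mark(self, block_number):
        """Set at UPDATE/DELETE time: this page now holds a dead tuple candidate."""
        self.dirty_pages.add(block_number)

    def pages_to_vacuum(self):
        """VACUUM scans only these pages instead of the whole heap."""
        marked, self.dirty_pages = self.dirty_pages, set()
        return sorted(marked)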
>> You seem to be assuming that only few tuples have changed between vacuums, so
>> that WAL could quickly guide the VACUUM processes to the areas where cleaning
>> is necessary.
>>
>> Let's drop that assumption, because by default, autovacuum_scale_factor is 20%,
>> so a VACUUM process normally kicks in after 20% of tuples changed (disk space
>> is cheap, I/O isn't). Additionally, there's a default nap time of one minute -
>> and VACUUM is forced to take at least that much of a nap.
>
> I think this is exactly backwards. The goal should be to improve vacuum, then
> adjust the autovacuum_scale_factor as low as we can. As vacuum gets cheaper
> the scale factor can go lower and lower.
But you can't lower it endlessly; it's still a compromise, because a lower
setting also means fewer tuples get cleaned per scan, which works against
the goal of minimizing the overall I/O cost of vacuuming.
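A back-of-the-envelope illustration of that compromise, with made-up table sizes (today's VACUUM reads the whole heap no matter how few dead tuples it reclaims):

heap_pages = 1_000_000            # pages read per vacuum scan (whole table)
tuples     = 100_000_000          # live tuples in the table

for scale_factor in (0.20, 0.05):
    dead_per_scan = tuples * scale_factor      # tuples reclaimed per scan
    pages_per_dead = heap_pages / dead_per_scan
    print(f"{scale_factor:.0%}: {pages_per_dead:.3f} pages read per tuple cleaned")

# 20%: 0.050 pages read per tuple cleaned
# 5%:  0.200 pages read per tuple cleaned -- four times the scan I/O per tuple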
> We shouldn't allow the existing
> autovacuum behaviour to control the way vacuum works.
That's a fair point.
> As a side point, "disk is cheap, I/O isn't" is a weird statement. The more
> disk you use the more I/O you'll have to do to work with the data.
That's only true as long as you need *all* your data to work with it.
> I still
> maintain the default autovacuum_scale_factor is *far* too liberal. If I had my
> druthers it would be 5%. But that's mostly informed by TPCC experience, in
> real life the actual value will vary depending on the width of your records
> and the relative length of your transactions versus transaction rate. The TPCC
> experience is with ~ 400 byte records and many short transactions.
Hm.. 5% vs. 20% would mean 4x as many vacuum scans, but only 15% more bloat
(a table at roughly 105% vs. 120% of its compacted size), right? Granted,
those extra 15% also occupy memory and caches, resulting in additional
I/O... Still, these numbers surprise me. Or am I missing something?
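For what it's worth, the arithmetic behind those numbers, with an arbitrary example workload just to check the ratios:

updates_per_day = 20_000_000      # arbitrary example workload
tuples          = 100_000_000

for scale_factor in (0.05, 0.20):
    scans_per_day = updates_per_day / (tuples * scale_factor)
    peak_size = 1 + scale_factor           # relative to the compacted table
    print(f"{scale_factor:.0%}: {scans_per_day:.0f} scans/day, peak size {peak_size:.0%}")

# 5%:  4 scans/day, peak size 105%
# 20%: 1 scan/day,  peak size 120%   -- 4x the scans buy only 15% less bloat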
Regards
Markus