From: | Markus Schiltknecht <markus(at)bluegap(dot)ch> |
---|---|
To: | Gokulakannan Somasundaram <gokul007(at)gmail(dot)com> |
Cc: | pgsql-hackers list <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Some ideas about Vacuum |
Date: | 2008-01-09 12:21:29 |
Message-ID: | 4784BC49.5090701@bluegap.ch |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
Gokulakannan Somasundaram wrote:
> If we can ask the Vacuum process to scan
> the WAL log, it can get all the relevant details on where it needs to
> go.
You seem to be assuming that only few tuples have changed between
vacuums, so that WAL could quickly guide the VACUUM processes to the
areas where cleaning is necessary.
Let's drop that assumption, because by default, autovacuum_scale_factor
is 20%, so a VACUUM process normally kicks in after 20% of tuples
changed (disk space is cheap, I/O isn't). Additionally, there's a
default nap time of one minute - and VACUUM is forced to take at least
that much of a nap.
So it's easily possible having more dead tuples, than live ones. In such
cases, scanning the WAL can easily takes *longer* than scanning the
table, because the amount of WAL to read would be bigger.
> One main restriction it places on the WAL Logs is that the WAL Log
> needs to be archived only after all the transactions in it completes. In
> other words, WAL logs need to be given enough space, to survive the
> longest transaction of the database. It is possible to avoid this
> situation by asking the Vacuum process to take the necessary information
> out of WAL log and store it somewhere and wait for the long running
> transaction to complete.
That would result in even more I/O...
> The information of interest in WAL is only the table
> inserts/updates/deletes. So if everyone accepts that this is a good
> idea, till this point, there is a point in reading further.
Well, that's the information of interest, the question is where to store
that information. Maintaining a dead space map looks a lot cheaper to
me, than relying on the WAL to store that information.
> Ultimately, what has been achieved till now is that we have made the
> sequential scans made by the Vacuum process on each table into a few
> random i/os. Of course there are optimizations possible to group the
> random i/os and find some sequential i/o out of it. But still we need to
> do a full index scan for all those indexes out there. HOT might have
> saved some work over there. But i am pessimistic here and wondering how
> it could have been improved. So it just strikes me, we can do the same
> thing which we did just with the tables. Convert a seq scan of the
> entire table into a random scan of few blocks. We can read the necessary
> tuple information from the tuples, group them and hit at the index in
> just those blocks and clean it up.
Sorry, I don't quite get what you are talking about here. What do
indexes have to do with dead space? Why not just keep acting on the
block level?
> I can already hear people, saying that it is not always possible to
> go back to index from table. There is this culprit called unstable
> function based indexes.
No, there's no such thing. Citing [1]: "All functions and operators used
in an index definition must be "immutable", that is, their results must
depend only on their arguments and never on any outside influence".
Of course, you can mark any function IMMUTABLE and get unstable function
based indexes, but that turns into a giant foot gun very quickly.
> P.S.: Let the objections/opposing views have a subtle reduction in its
> harshness.
I'm just pointing at things that are in conflict with my knowledge,
assumptions and believes, all which might be erroneous, plain wrong or
completely mad. ;-)
Regards
Markus
[1]: the Very Fine Postgres Manual on CREATE INDEX:
http://www.postgresql.org/docs/8.3/static/sql-createindex.html
From | Date | Subject | |
---|---|---|---|
Next Message | Simon Riggs | 2008-01-09 12:56:48 | Re: Dynamic Partitioning using Segment Visibility Maps |
Previous Message | Michael Akinde | 2008-01-09 10:50:01 | Re: VACUUM FULL out of memory |