Is there a way to make VACUUM run completely outside transaction

From: Hannu Krosing <hannu(at)tm(dot)ee>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Is there a way to make VACUUM run completely outside transaction
Date: 2005-02-07 01:45:26
Message-ID: 1107740727.8087.30.camel@fuji.krosing.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

As VACUUM is not something that can be rolled back, could we not make it
run completely outside transactions. It already needs to be run outside
a transaction block.

I try to explain the problem more thoroughly below (I'm quite sleepy, so
the explanation may be not too clear ;)

My problem is as follows:

I have a fairly write intensive database that has two kinds of tables -

1) some with a small working set (thousands of rows), but their get
constant traffic of hundreds of inserts/updates/deletes per second.

2) and some with bigger tables (a few million rows) which get smaller
load of updates.

I keep the first type of tables filesize small/dead tuples count low by
running vacuum on them in a tight loop with 15 sec sleeps in between
vacuuming individual tables)

And I keep the 2nd type balanced by running another loop of vacuums on
them.

It worked quite well for a while, but as the tables grow, the vacuums on
the 2nd kind of tables run long enough for the first kind of tables to
accumulate hundreds of thousands dead tuples, which can't be freed
because the vacuums on 2nd kind run in their own long transactions,
keeping the oldest active transaction id smaller than needed.

And the new provisions of making VACUUM less intrusive by allowing
delays in vacuuming make this problem worse, by kind of priority
inverion - the less intrusive vacuum runs longer, thereby keeping its
transaction open longer and thereby being *more* intrusive by not
allowing old tuples to be deleted by another vacuum command.

I can see two ways to solve this problem:

1) If possible, VACUUM command should close its transaction id early in
the command processing, as it does not really need to be in transactions

2) the transaction of the VACUUM command could be assigned some really
bug trx id, so it won't get in a way

3) VACUUM commits/or aborts its transaction after some predetermined
interval (say 1 min) and starts a new one.

4) VACUUM itself could have some way to check if the "oldest"
transaction is also VACUUM, and be able to free the tuples that are
older than last *data altering* transaction. At least VACUUM, ANALYSE
and TRUNCATE should not be considered *data altering* here.

The problems caused by non-data-altering but long-running transactionsis
are not unique to VACUUM, similar problems also affects Slony. So any of
1)-3) would be preferable to 4)

Or perhaps it already solved in 8.0 ? My rant is about 7.4.6.

--
Hannu Krosing <hannu(at)tm(dot)ee>

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2005-02-07 01:51:18 Re: Thinking about breaking up the BufMgrLock
Previous Message Neil Conway 2005-02-07 01:35:48 Re: Thinking about breaking up the BufMgrLock