From: | Christopher Browne <cbbrowne(at)acm(dot)org> |
---|---|
To: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: vacuum, performance, and MVCC |
Date: | 2006-06-24 21:57:20 |
Message-ID: | 87ejxeb4jj.fsf@wolfe.cbbrowne.com |
Lists: | pgsql-hackers |
Martha Stewart called it a Good Thing when JanWieck(at)Yahoo(dot)com (Jan Wieck) wrote:
> On 6/22/2006 2:37 PM, Alvaro Herrera wrote:
>
>> Adding back pgsql-hackers.
>> Mark Woodward wrote:
>>> > Mark Woodward wrote:
>>> >
>>> >> Hmm, OK, then the problem is more serious than I suspected.
>>> >> This means that every index on a row has to be updated on every
>>> >> transaction that modifies that row. Is that correct?
>>> >
>>> > Add an index entry, yes.
>>> >
>>> >> I am attaching some code that shows the problem with regard to
>>> >> applications such as web server session management. When run, each
>>> >> second the system can handle fewer and fewer connections. Here is a
>>> >> brief output:
>>> >> [...]
>>> >> There has to be a more linear way of handling this scenario.
>>> >
>>> > So vacuum the table often.
>>> That fixes the symptom, not the problem. The problem is performance
>>> steadily degrades over time.
>> No, you got it backwards. The performance degradation is the
>> symptom.
>> The problem is that there are too many dead tuples in the table. There
>> is one way to solve that problem -- remove them, which is done by
>> running vacuum.
>
> Precisely.
>
>> There are some problems with vacuum itself, that I agree with. For
>> example it would be good if a long-running vacuum wouldn't affect a
>> vacuum running in another table because of the long-running transaction
>> effect it has. It would be good if vacuum could be run partially over a
>> table. It would be good if there was a way to speed up vacuum by using
>> a dead space map or something.
>
> It would be good if vacuum wouldn't waste time on blocks that don't
> have any possible work in them. Vacuum has two main purposes. A)
> remove dead rows and B) freeze xids. Once a block has zero deleted
> rows and all xids are frozen, there is nothing to do with this block
> and vacuum should skip it until a transaction updates that block.
>
> This requires 2 bits per block, which is 32K per 1G segment of a
> heap. Clearing the bits is done when the block is marked dirty. This
> way vacuum would not waste any time and IO on huge, slowly changing
> tables. That part, sequentially scanning huge tables that didn't
> change much, is what keeps us from running vacuum every couple of
> seconds.

This is, in effect, the "VACUUM Space Map."

I see one unfortunate thing about that representation of it, namely
that it would in effect require that non-frozen pages be kept on the
VSM for potentially a long time.

Based on *present* VACUUM strategy, at least.

Would it not be the case, here, that any time a page could be
"frozen," it would have to be? In effect, we are always trying to run
VACUUM FREEZE?
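
For concreteness, here is a rough sketch of the two-bits-per-block map
described in the quoted proposal, together with the size arithmetic.
This is not PostgreSQL code. The names (VacuumSpaceMap,
map_bytes_per_segment, and the flag constants) are made up for
illustration, and the 8 kB block size is an assumption (it is the
PostgreSQL default).

```python
BLOCK_SIZE = 8 * 1024      # assumed block size: 8 kB (PostgreSQL default)
SEGMENT_SIZE = 1024 ** 3   # 1G heap segment, as in the quoted message
BITS_PER_BLOCK = 2         # one bit per vacuum purpose

NO_DEAD_ROWS = 0b01        # hypothetical flag: block has zero deleted rows
ALL_FROZEN = 0b10          # hypothetical flag: all xids in block are frozen
SKIPPABLE = NO_DEAD_ROWS | ALL_FROZEN


def map_bytes_per_segment(segment_size=SEGMENT_SIZE, block_size=BLOCK_SIZE):
    """Bytes of map needed to cover one heap segment."""
    blocks = segment_size // block_size
    return blocks * BITS_PER_BLOCK // 8


class VacuumSpaceMap:
    """Illustrative 2-bits-per-block map; four blocks share each byte."""

    def __init__(self, nblocks):
        self.nblocks = nblocks
        # All bits start cleared, so every block initially needs a visit.
        self.bits = bytearray((nblocks * BITS_PER_BLOCK + 7) // 8)

    def _locate(self, blkno):
        if not 0 <= blkno < self.nblocks:
            raise IndexError(blkno)
        bitpos = blkno * BITS_PER_BLOCK
        return bitpos // 8, bitpos % 8

    def get(self, blkno):
        byte, shift = self._locate(blkno)
        return (self.bits[byte] >> shift) & 0b11

    def set_flags(self, blkno, flags):
        """Called by vacuum after it has cleaned and/or frozen a block."""
        byte, shift = self._locate(blkno)
        self.bits[byte] |= (flags & 0b11) << shift

    def mark_dirty(self, blkno):
        """Clear both bits when a transaction dirties the block."""
        byte, shift = self._locate(blkno)
        self.bits[byte] &= ~(0b11 << shift) & 0xFF

    def blocks_to_vacuum(self):
        """Blocks vacuum still has to read: anything not fully skippable."""
        return [b for b in range(self.nblocks) if self.get(b) != SKIPPABLE]
```

A 1G segment holds 131072 blocks of 8 kB, and at 2 bits each that is
32768 bytes, which matches the 32K figure. Note that in this sketch a
block with no dead rows but with unfrozen xids still shows up in
blocks_to_vacuum(), which is the concern raised above about non-frozen
pages staying on the map.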
--
output = ("cbbrowne" "@" "gmail.com")
http://cbbrowne.com/info/finances.html
Rules of the Evil Overlord #72. "If all the heroes are standing
together around a strange device and begin to taunt me, I will pull
out a conventional weapon instead of using my unstoppable superweapon
on them." <http://www.eviloverlord.com/>