From: | Jan Wieck <JanWieck(at)Yahoo(dot)com> |
---|---|
To: | Shridhar Daithankar <shridhar(at)frodo(dot)hserus(dot)net> |
Cc: | pgsql(at)mohawksoft(dot)com, Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Why frequently updated tables are an issue |
Date: | 2004-06-12 18:54:23 |
Message-ID: | 40CB515F.3050303@Yahoo.com |
Lists: | pgsql-hackers |
On 6/10/2004 10:37 AM, Shridhar Daithankar wrote:
> pgsql(at)mohawksoft(dot)com wrote:
>> The session table is a different issue, but has the same problems. You
>> have an active website, hundreds or thousands of hits a second, and you
>> want to manage sessions for this site. Sessions are created, updated many
>> times, and deleted. Performance degrades steadily until a vacuum. Vacuum
>> has to be run VERY frequently. Prior to lazy vacuum, this was impossible.
>>
>> Both session tables and summary tables have another thing in common: they
>> are not vital data, they hold transient state information. Yeah, sure,
>> data integrity is important, but if you lose these values, you can either
>> recreate them or they aren't too important.
>>
>> Why put that in a database at all? Because, in the case of sessions
>> especially, you need to access this information for other operations. In
>> the case of summary tables, OLAP usually needs to join or include this
>> info.
>>
>> PostgreSQL's behavior on these cases is poor. I don't think anyone who has
>> tried to use PG for this sort of thing will disagree, and yes it is
>> getting better. Does anyone else consider this to be a problem? If so, I'm
>> open for suggestions on what can be done. I've suggested a number of
>> things, and admittedly they have all been pretty weak ideas, but they were
>> potentially workable.
>
> There is another approach, so far considered infeasible and hence rejected.
> Vacuum in PostgreSQL is tied to entire relations/objects since indexes do not
> have transaction visibility information.
>
> It has been suggested in the past to add such visibility information to the
> index tuple header so that indexes and heaps can be cleaned out of order. In
> such a case other background processes, such as the background writer and the
> soon-to-be-integrated autovacuum daemon, can vacuum pages/buffers rather than
> relations. That way the most used things will remain clean and the cost of
> cleanup will remain outside the critical transaction processing path.

This is not feasible because at the time you update or delete a row you
would have to visit all its index entries. The performance impact of
that would be immense.

But a per-relation bitmap that tells whether a block is a) free of dead
tuples and b) contains only frozen tuples could be used to let vacuum
skip such blocks (there can't be anything to do in them). The bit would
get reset whenever the block is marked dirty. This would cause vacuum to
look mainly at recently touched blocks, which are likely to be found in
the buffer cache anyway, and thus dramatically reduce the amount of I/O
and thereby make high-frequency vacuuming less expensive.
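
To make the bitmap idea concrete, here is a rough toy model in Python. It is
only a sketch under stated assumptions, not PostgreSQL code: the names
VacuumSkipMap, mark_dirty and vacuum are hypothetical, a "block" is just a list,
and a tuple is one of the strings "live", "dead" or "frozen".

```python
class VacuumSkipMap:
    """Hypothetical per-relation bitmap: one flag per heap block.

    True means the block has no dead tuples and every remaining tuple is
    frozen, so vacuum has nothing to do there.
    """

    def __init__(self, nblocks):
        # Start pessimistic: nothing is known to be clean yet.
        self.clean = [False] * nblocks

    def mark_dirty(self, blkno):
        # Any modification of the block resets its bit.
        self.clean[blkno] = False

    def mark_clean(self, blkno):
        self.clean[blkno] = True

    def blocks_to_scan(self):
        # Only blocks whose bit is not set need a visit.
        return [b for b, ok in enumerate(self.clean) if not ok]


def vacuum(heap, skipmap):
    """Toy vacuum pass over `heap`, a list of blocks (lists of tuple states).

    Returns the number of blocks actually visited.
    """
    scanned = 0
    for blkno in skipmap.blocks_to_scan():
        scanned += 1
        block = heap[blkno]
        # Remove dead tuples in place.
        block[:] = [t for t in block if t != "dead"]
        # Set the bit only if everything left is frozen.
        if all(t == "frozen" for t in block):
            skipmap.mark_clean(blkno)
    return scanned


if __name__ == "__main__":
    heap = [["frozen", "frozen"], ["dead", "frozen"], ["live"]]
    skipmap = VacuumSkipMap(len(heap))
    print("first pass visited", vacuum(heap, skipmap), "blocks")
    print("second pass visited", vacuum(heap, skipmap), "blocks")
```

In this model the first pass has to visit every block, while later passes only
visit blocks that were dirtied since or that still hold unfrozen tuples.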
Jan
--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #