Re: Why frequently updated tables are an issue

From: Shridhar Daithankar <shridhar(at)frodo(dot)hserus(dot)net>
To: pgsql(at)mohawksoft(dot)com
Cc: Christopher Browne <cbbrowne(at)acm(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Why frequently updated tables are an issue
Date: 2004-06-10 14:37:58
Message-ID: 40C87246.9040208@frodo.hserus.net
Lists: pgsql-hackers

pgsql(at)mohawksoft(dot)com wrote:
> The session table is a different issue, but has the same problems. You
> have an active website, hundreds or thousands of hits a second, and you
> want to manage sessions for this site. Sessions are created, updated many
> times, and deleted. Performance degrades steadily until a vacuum. Vacuum
> has to be run VERY frequently. Prior to lazy vacuum, this was impossible.
>
> Both session tables and summary tables have another thing in common: they
> are not vital data; they hold transient state information. Yes, sure,
> data integrity is important, but if you lose these values, you can either
> recreate them or they aren't too important.
>
> Why put that in a database at all? Because, in the case of sessions
> especially, you need to access this information for other operations. In
> the case of summary tables, OLAP usually needs to join or include this
> info.
>
> PostgreSQL's behavior in these cases is poor. I don't think anyone who has
> tried to use PG for this sort of thing will disagree, and yes it is
> getting better. Does anyone else consider this to be a problem? If so, I'm
> open to suggestions on what can be done. I've suggested a number of
> things, and admittedly they have all been pretty weak ideas, but they were
> potentially workable.

There is another approach, infeasible as of now and hence rejected. Vacuum in
PostgreSQL is tied to entire relations/objects, since indexes do not carry
transaction visibility information.

It has been suggested in the past to add such visibility information to the
index tuple header so that indexes and heaps can be cleaned out of order. In
that case other background processes, such as the background writer and the
soon-to-be-integrated autovacuum daemon, can vacuum pages/buffers rather than
relations. That way the most-used things will remain clean and the cost of
cleanup will remain outside the critical transaction processing path.
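
To make the difference concrete, here is a toy Python model. It is not
PostgreSQL code, just a sketch under stated assumptions: all names
(HeapTuple, IndexEntry, vacuum_relation, vacuum_heap_page, vacuum_index_page)
are made up for illustration, and the xmax field on IndexEntry stands for the
hypothetical visibility information that index tuples do not have today.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HeapTuple:
    key: int
    xmax: Optional[int] = None  # id of the deleting transaction, None if live


@dataclass
class IndexEntry:
    key: int
    tid: Tuple[int, int]        # (heap page number, slot) the entry points to
    xmax: Optional[int] = None  # HYPOTHETICAL: visibility copied into the index


def is_dead(xmax: Optional[int], oldest_xid: int) -> bool:
    # A tuple is reclaimable once its deleter is older than every
    # transaction that could still see it.
    return xmax is not None and xmax < oldest_xid


def vacuum_relation(heap: List[list], index: List[list], oldest_xid: int) -> int:
    """Relation-level cleanup: the index has no visibility of its own, so dead
    TIDs are collected from the whole heap and matched against the whole index."""
    dead = {
        (p, s)
        for p, page in enumerate(heap)
        for s, tup in enumerate(page)
        if tup is not None and is_dead(tup.xmax, oldest_xid)
    }
    for ipage in index:
        ipage[:] = [e for e in ipage if e.tid not in dead]
    for p, s in dead:
        heap[p][s] = None  # slot becomes free space
    return len(dead)


def vacuum_heap_page(page: list, oldest_xid: int) -> int:
    """Page-level cleanup of one heap page, without touching any index."""
    freed = 0
    for s, tup in enumerate(page):
        if tup is not None and is_dead(tup.xmax, oldest_xid):
            page[s] = None
            freed += 1
    return freed


def vacuum_index_page(ipage: list, oldest_xid: int) -> int:
    """Page-level cleanup of one index page, using only the (hypothetical)
    visibility stored in the index entries themselves."""
    before = len(ipage)
    ipage[:] = [e for e in ipage if not is_dead(e.xmax, oldest_xid)]
    return before - len(ipage)
```

In this model vacuum_relation has to walk every heap page and every index
page, while the two page-level functions can each be called on a single buffer
in any order, which is what would let a background process do the work. The
price is the extra xmax field on every index entry.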

However, increasing the index footprint seems to be a tough sell. Besides, the
FSM would need some rework to accommodate/autotune its behaviour.

I am quoting from memory, so don't flame me if I misquote it. I am just adding
this for completeness. Purely from a performance point of view, it could solve
quite a few problems, at least in theory.

Shridhar
