Re: Improving count(*)

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Improving count(*)
Date: 2005-11-17 22:11:09
Message-ID: 20051117221109.GL22933@svana.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 17, 2005 at 09:34:08PM +0000, Simon Riggs wrote:
> Adding visibility to an index would add substantial bulk to any index.
> If we could do this at the same time as adding leading key, full field
> compression (*not* prefix compression), then it might be worth doing.

I think the single biggest problem with visibility-in-index is that
there is no link from the tuple to the index. So if you update a tuple,
the only way to update the index is the start from the top and go down
until you find it. If your table/index is of any size, you can imagine
the overhead will kill you.

Now, lets say you add a field to the tuple which you the position of
the index entry. You can only reasonably do this for one index, say the
primary key. Now you have a two-way link the updating becomes much
quicker, at the cost of even more overhead.

Doing it only for one index per table may be sensible anyway since you
don't really want to store visibility any more times than necessary.

> I would also note that DELETE would need to touch all visible index
> rows, which currently is not required for btree indexes. (But as we
> already noted, any solution must include visibility data and so any
> solution must update some data structure on delete).

Remember, UPDATE = DELETE + UPDATE, so you have to handle all updates
too. Inserts are the only easy case (well, except the fact that they
have to point to eachother. locking nastyness).

> Index-only plans could help with various GROUP BY and join queries also,
> so it certainly is attractive, though costly.

Only in cases where you don't need the data (ie EXISTS), otherwise you
still need the tuple.

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
> tool for doing 5% of the work and then sitting around waiting for someone
> else to do the other 95% so you can sue them.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Philip Yarra 2005-11-17 22:25:04 Re: Optional postgres database not so optional in 8.1
Previous Message Simon Riggs 2005-11-17 21:48:43 Re: CLUSTER and clustered indices