Re: count(*) using index scan in "query often, update rarely" environment

From: "Merlin Moncure" <merlin(dot)moncure(at)rcsonline(dot)com>
To: "hubert depesz lubaczewski" <depesz(at)gmail(dot)com>
Cc: <pgsql-performance(at)postgresql(dot)org>
Subject: Re: count(*) using index scan in "query often, update rarely" environment
Date: 2005-10-07 13:50:30
Message-ID: 6EE64EF3AB31D5448D0007DD34EEB3417DD541@Herge.rcsinc.local
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On 10/7/05, Cestmir Hybl <cestmirl(at)freeside(dot)sk> wrote:
Isn't it possible (and reasonable) for these environments to keep track
of whether there is a transaction in progress with update to given table
and if not, use an index scan (count(*) where) or cached value
(count(*)) to perform this kind of query?
________________________________________

The answer to the first question is subtle. Basically, the PostgreSQL
engine is designed for high concurrency. We are definitely on the right
side of the cost/benefit tradeoff here. SQL server does not have MVCC
(or at least until 2005 appears) so they are on the other side of the
tradeoff.

You can of course serialize the access yourself by materializing the
count in a small table and use triggers or cleverly designed
transactions. This is trickier than it might look however so check the
archives for a thorough treatment of the topic.

One interesting thing is that making count(*) over large swaths of data
is frequently an indicator of a poorly normalized database. Is it
possible to optimize the counting by laying out your data in a different
way?

Merlin

Browse pgsql-performance by date

  From Date Subject
Next Message Richard Huxton 2005-10-07 14:11:01 Re: count(*) using index scan in "query often, update rarely"
Previous Message Tom Lane 2005-10-07 13:42:32 Re: count(*) using index scan in "query often, update rarely" environment