Re: Make COUNT(*) Faster?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Chris Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Make COUNT(*) Faster?
Date: 2005-07-08 04:27:23
Message-ID: 8508.1120796843@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Chris Browne <cbbrowne(at)acm(dot)org> writes:
> vmehta(at)apple(dot)com (Varun Mehta) writes:
>> If I run an EXPLAIN on this query I can see that it is doing a
>> sequential scan, which seems quite needless, as surely this
>> information is cached in some secret location.

> [ example scenario snipped ]
> If there were some "single secret place" with a count, how would you
> suggest it address those 78 tuples and 16 transactions that aren't yet
> (and maybe never will be) part of the count?

It's worse than that: once some of those transactions have committed,
the right answer is observer-dependent, since some onlooker transactions
may see those guys as committed while others think they are not yet
committed. So there could certainly not be just one secret place...

There are solutions suggested in the archives, but they all amount
to making COUNT(*)-with-no-WHERE-or-GROUP-BY-clause fast at the price
of nontrivial distributed overhead for all updates --- overhead that
would be paid whether or not the application ever did such a COUNT.
That's not a tradeoff we've wanted to make in general. You can
implement it yourself via triggers for specific tables that you think
it's worth doing for.

Also, if an approximate answer is good enough, there are a whole other
set of possible solutions.

regards, tom lane

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Kenneth Gonsalves 2005-07-08 05:48:06 two sums in one query
Previous Message Chris Browne 2005-07-08 03:39:21 Re: Make COUNT(*) Faster?