Re: Make COUNT(*) Faster?

From: Steve Wampler <swampler(at)noao(dot)edu>
To: Christopher Browne <cbbrowne(at)acm(dot)org>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Make COUNT(*) Faster?
Date: 2005-07-08 17:08:52
Message-ID: 42CEB324.7040101@noao.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Christopher Browne wrote:
>>I'm also claiming that a true count for any active table is
>>meaningless and am *not* suggesting that effort be spent on trying
>>to produce such a true count.
>
>
> That's a pretty big assumption that would in fact be WRONG.

Please reread the message from Bruno and reconcile the above statement
with his assertion (which I believe) that there is *no* single true count
for an active table. [I'm defining 'active' as currently
undergoing insert/copy/delete/update actions].

> We have managers interested in counting the number of objects we have
> around (As a domain registry, what objects would you imagine those
> might be :-)), and they're keen on possibly even being able to
> reconcile those counts from day to day based on transaction activity.

If Bruno is correct, then they need to do this reconcilation from
within a single transaction (the same one that does the COUNT(*)) - or
else they are working on an 'inactive' table [one not currently
accepting changes]. If neither condition holds, then isn't the result
they are using from COUNT(*) currently is *already* an approximation?

> Leaping into some sort of vague guesstimation would destroy the
> ability to do any kind of analysis of activity, and I daresay enrage
> them.

No doubt! Let's hope the above conditions hold.

> There may be times that a really rough guess can suffice; there are
> other times when exactness is absolutely vital.

But, as others have said, COUNT(*) does not return a true count for
a table, but rather just a true count for the *current transaction*.
So COUNT(*)'s from different simultaneous transactions may very well
produce different values.

> Creating a "fast but WRONG COUNT(*)" which prevented getting the exact
> answer that the present implementation provides would be a severe
> misfeature.

Agreed - note that I did not suggest replacing the current COUNT(*)
with an inexact version, but wanted (and now have) a quick way to get
a reasonable approximation of the current table size.

--
Steve Wampler -- swampler(at)noao(dot)edu
The gods that smiled on your birth are now laughing out loud.

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Ying Lu 2005-07-08 17:08:53 Re: Index creation question for expression (col1 || '-' ||
Previous Message Bruno Wolff III 2005-07-08 16:57:14 Re: Index creation question for expression (col1 || '-' || col2)