Quick Links

Re: Make COUNT(*) Faster?

From:	Steve Wampler <swampler(at)noao(dot)edu>
To:	Bruno Wolff III <bruno(at)wolff(dot)to>
Cc:	pgsql-sql(at)postgresql(dot)org
Subject:	Re: Make COUNT(*) Faster?
Date:	2005-07-08 15:07:27
Message-ID:	42CE96AF.3020903@noao.edu
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-sql

Bruno Wolff III wrote:
> No, it is an exact count.

Yes, for the transaction, but it's an approximation of the number of
tuples in the table - which is probably what the people who worry about
its cost are more interested in (an approximate count for the table).
I'm also claiming that a true count for any active table is meaningless and
am *not* suggesting that effort be spent on trying to produce such
a true count.

>>I assume this has been beaten well past death, but I don't see why it
>>wouldn't be possible to keep pg_class.reltuples a bit more up-to-date
>>instead of updating it only on vacuums.
>
>
> Because it costs resources to keep track of that and people don't usually need
> exact tuple counts for whole tables.

Yes, we agree completely! (Which is why I said 'a bit more' instead of
'exactly' above.) My uses for COUNT(*) are to get 'reasonable' approximate
counts of the table sizes - not true counts, but approximate values. Unfortunately,
pg_class.reltuples gets too far off too fast for me to use it as a consistent guide
to current table size. If you Folks Who Know believe that simply keeping
pg_class.reltuples 'closer' to the actual table size is too expensive, I'll
accept that [after all, I have to right now anyway], but I'm surprised that
it is, given all the other work that must go on at the start/close of a transaction.

I also understand that 'reasonable' and 'closer' are vague terms.
In the example scenerio where there were around 80 rows in an indeterminate
state, my claim is that, in a table of around a million rows, it doesn't
matter whether some portion of those indeterminate rows are included in
an approximation of the table size or not (though it might in a table of
100 'true' rows - but the decision to ask for a true 'transaction' count (slow)
or an approximate table size (fast) should be left to the user in either case).

So, leave COUNT(*) alone. But it would be very handy to have a
way to get an approximate table size that is more accurate than is
provided by a pg_class.reltuples that is only updated on vacuums.

--
Steve Wampler -- swampler(at)noao(dot)edu
The gods that smiled on your birth are now laughing out loud.

In response to

Re: Make COUNT(*) Faster? at 2005-07-08 14:35:08 from Bruno Wolff III

Responses

Re: Make COUNT(*) Faster? at 2005-07-08 15:22:30 from Rod Taylor
Re: Make COUNT(*) Faster? at 2005-07-08 15:31:00 from Andrew Sullivan
Re: Make COUNT(*) Faster? at 2005-07-08 15:35:00 from Tom Lane
Re: Make COUNT(*) Faster? at 2005-07-08 17:09:29 from Christopher Browne

Browse pgsql-sql by date

	From	Date	Subject
Next Message	Rod Taylor	2005-07-08 15:22:30	Re: Make COUNT(*) Faster?
Previous Message	PFC	2005-07-08 14:49:44	Re: two sums in one query