From: | Steve Wampler <swampler(at)noao(dot)edu> |
---|---|
To: | Bruno Wolff III <bruno(at)wolff(dot)to> |
Cc: | pgsql-sql(at)postgresql(dot)org |
Subject: | Re: Make COUNT(*) Faster? |
Date: | 2005-07-08 15:07:27 |
Message-ID: | 42CE96AF.3020903@noao.edu |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Bruno Wolff III wrote:
> No, it is an exact count.
Yes, for the transaction, but it's an approximation of the number of
tuples in the table - which is probably what the people who worry about
its cost are more interested in (an approximate count for the table).
I'm also claiming that a true count for any active table is meaningless and
am *not* suggesting that effort be spent on trying to produce such
a true count.
>>I assume this has been beaten well past death, but I don't see why it
>>wouldn't be possible to keep pg_class.reltuples a bit more up-to-date
>>instead of updating it only on vacuums.
>
>
> Because it costs resources to keep track of that and people don't usually need
> exact tuple counts for whole tables.
Yes, we agree completely! (Which is why I said 'a bit more' instead of
'exactly' above.) My uses for COUNT(*) are to get 'reasonable' approximate
counts of the table sizes - not true counts, but approximate values. Unfortunately,
pg_class.reltuples gets too far off too fast for me to use it as a consistent guide
to current table size. If you Folks Who Know believe that simply keeping
pg_class.reltuples 'closer' to the actual table size is too expensive, I'll
accept that [after all, I have to right now anyway], but I'm surprised that
it is, given all the other work that must go on at the start/close of a transaction.
I also understand that 'reasonable' and 'closer' are vague terms.
In the example scenerio where there were around 80 rows in an indeterminate
state, my claim is that, in a table of around a million rows, it doesn't
matter whether some portion of those indeterminate rows are included in
an approximation of the table size or not (though it might in a table of
100 'true' rows - but the decision to ask for a true 'transaction' count (slow)
or an approximate table size (fast) should be left to the user in either case).
So, leave COUNT(*) alone. But it would be very handy to have a
way to get an approximate table size that is more accurate than is
provided by a pg_class.reltuples that is only updated on vacuums.
--
Steve Wampler -- swampler(at)noao(dot)edu
The gods that smiled on your birth are now laughing out loud.
From | Date | Subject | |
---|---|---|---|
Next Message | Rod Taylor | 2005-07-08 15:22:30 | Re: Make COUNT(*) Faster? |
Previous Message | PFC | 2005-07-08 14:49:44 | Re: two sums in one query |