Quick Links

Re: Improving count(*)

From:	Gavin Sherry <swm(at)linuxworld(dot)com(dot)au>
To:	Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc:	Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Improving count(*)
Date:	2005-11-18 00:51:23
Message-ID:	Pine.LNX.4.58.0511181149070.9614@linuxworld.com.au
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Fri, 18 Nov 2005, Simon Riggs wrote:

> >From here, another proposal. We have a GUC called count_uses_estimate
> that is set to off by default. If set to true, then a count(*) will use
> the planner logic to estimate number of rows in the table and return
> that as the answer, rather than actually count the row. Unless analyze
> statistics are not available, in which case it does the real count.

I'm finishing off a tablesample patch a grad student on #postgresql was
working on.

template1=# select count(*)*100 from a tablesample system(1) repeatable
(2);
?column?
----------
8371100
(1 row)

Time: 6366.757 ms
template1=# select count(*)*50 from a tablesample system(2) repeatable
(11);
?column?
----------
8453550
(1 row)

Time: 10521.871 ms
template1=# select count(*)*10 from a tablesample system(10) repeatable
(3);
?column?
----------
8314350
(1 row)

Time: 28744.498 ms
template1=# select count(*) from a;
count
---------
8388608
(1 row)

Time: 33897.857 ms

Seems like a better solution. I can finish the patch pretty soon. I need
to contact the original author, who has disappeared, but I'll send it over
to you.

Gavin

In response to

Re: Improving count(*) at 2005-11-18 00:08:03 from Simon Riggs

Responses

Re: Improving count(*) at 2005-11-18 07:41:01 from Simon Riggs

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Alvaro Herrera	2005-11-18 00:57:38	Re: CLUSTER and clustered indices
Previous Message	Joe Conway	2005-11-18 00:49:31	Re: Some array semantics issues