From: | Gavin Sherry <swm(at)linuxworld(dot)com(dot)au> |
---|---|
To: | Simon Riggs <simon(at)2ndquadrant(dot)com> |
Cc: | Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, tgl(at)sss(dot)pgh(dot)pa(dot)us, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Improving count(*) |
Date: | 2005-11-18 00:51:23 |
Message-ID: | Pine.LNX.4.58.0511181149070.9614@linuxworld.com.au |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, 18 Nov 2005, Simon Riggs wrote:
> >From here, another proposal. We have a GUC called count_uses_estimate
> that is set to off by default. If set to true, then a count(*) will use
> the planner logic to estimate number of rows in the table and return
> that as the answer, rather than actually count the row. Unless analyze
> statistics are not available, in which case it does the real count.
I'm finishing off a tablesample patch a grad student on #postgresql was
working on.
template1=# select count(*)*100 from a tablesample system(1) repeatable
(2);
?column?
----------
8371100
(1 row)
Time: 6366.757 ms
template1=# select count(*)*50 from a tablesample system(2) repeatable
(11);
?column?
----------
8453550
(1 row)
Time: 10521.871 ms
template1=# select count(*)*10 from a tablesample system(10) repeatable
(3);
?column?
----------
8314350
(1 row)
Time: 28744.498 ms
template1=# select count(*) from a;
count
---------
8388608
(1 row)
Time: 33897.857 ms
Seems like a better solution. I can finish the patch pretty soon. I need
to contact the original author, who has disappeared, but I'll send it over
to you.
Gavin
From | Date | Subject | |
---|---|---|---|
Next Message | Alvaro Herrera | 2005-11-18 00:57:38 | Re: CLUSTER and clustered indices |
Previous Message | Joe Conway | 2005-11-18 00:49:31 | Re: Some array semantics issues |