Re: performance of count(*)

From: Scott Ribe <scott_ribe(at)elevated-dev(dot)com>
To: Tomas Vondra <tv(at)fuzzy(dot)cz>, Andrew Sullivan <ajs(at)crankycanuck(dot)ca>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: performance of count(*)
Date: 2011-05-06 21:43:02
Message-ID: D0947951-92B5-42F1-A8BF-297A0A6F2ED0@elevated-dev.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On May 6, 2011, at 1:39 PM, Tomas Vondra wrote:

> Anyway I'd recommend to start with the eager approach, it's much easier
> to implement. You can implement the lazy approach later, if you find out
> it's needed.

With the eager approach, I think I'm too likely to get write conflicts. Thanks for the reference to the paper, I believe that's what I was looking for.

> And you should strive to use HOT feature (if you're on >= 8.4),
> especially with the eager approach - it often does a lot of updates and
> leads to bloat of the aggregated table. So decrease the fillfactor and
> do not index the columns that are updated by the triggers.

See, that's the kind of info I'm looking for ;-)

On May 6, 2011, at 1:59 PM, Andrew Sullivan wrote:

> If the WHERE clause is fairly selective and indexed, that should be
> fast. Not as fast as estimates based on trigger-written values in
> another table, of course, but reasonably fast. So the first order of
> business is usually to find or create indexes that will make SELECT on
> the same criteria fast.

In this case, it depends on the result of a pretty complex join that involves some gnarly time calculations, and finding the unmatched rows from one side of an outer join. I really don't think there's a way to optimize the straight-up query to be faster than it is, I looked at that for a good long time, explain/analyze and all. Postgres is using the appropriate index to narrow things down as much as it can at the very beginning, it just then has to perform a heck of a lot of work to finish the join... And it's not taking ***that*** long--it's just that I want it faster!

> It's only unqualified "SELECT count(*)" that is slow. Generally, the
> system table is good enough for that, I find. (Someone: "How long
> will this take?" Me: "There are about 400 million rows to go
> through." Even if you're off by 50 million at that point, it doesn't
> matter.)

FYI, I have no need for unqualified select count(*) in this app--just doesn't happen, ever ;-)

Thanks.

--
Scott Ribe
scott_ribe(at)elevated-dev(dot)com
http://www.elevated-dev.com/
(303) 722-0567 voice

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Andrew Sullivan 2011-05-06 22:15:38 Re: performance of count(*)
Previous Message David Johnston 2011-05-06 21:17:06 Re: Foreign key in composite values