Re: Why count(*) doest use index?

From: Glenn Maynard <glenn(at)zewt(dot)org>
To: pgsql-general(at)postgresql(dot)org
Subject: Re: Why count(*) doest use index?
Date: 2011-03-07 22:26:56
Message-ID: AANLkTimWYMfRa1vBsw4N58bYJXCbDqd3gOWG-hyqer=O@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Mon, Mar 7, 2011 at 4:35 PM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> SELECT COUNT(*) FROM table WHERE expr;
>
> will use index (assuming expr is optimizable and is worth while to
> optimize). Your case might be interesting for cache purposes if expr2
> is expensive, but has nothing to do with postgres index usage via
> count(*). mysql/myisam needs to scan as well in this case -- it
> can't magically 'look up' the value as it can for the in filtered
> (very special) case... it only differs from pg in that it can skip
> heap visibility check because all records are known good (and pg is
> moving towards optimizing this case in mostly read only workloads!)
>

It'll do an index scan, but it's still a scan--linear time over the size of
the set. That's too expensive for many cases.

My particular case is something like this:

SELECT COUNT(*), event_time::date FROM events
WHERE event_time::date >= '2011-01-01' AND event_time::date < '2011-02-01'
AND user=50
GROUP BY event_time::date;

An index on "events(user, event_time::date)" could optimize this, eg.
effectively maintaining a count of matching rows for each (user, day)
tuple--which is ultimately what I'm doing manually with triggers. Of
course, it would have a significant cost, in some combination of complexity,
index size and write concurrency, and couldn't be the default behavior for
an index.

--
Glenn Maynard

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Aleksey Tsalolikhin 2011-03-07 22:45:00 Re: database is bigger after dump/restore - why? (60 GB to 109 GB)
Previous Message Michael Black 2011-03-07 22:00:16 Re: First production install - general advice