Quick Links

Re: Simple query not using index: why?

From:	William Garrison <postgres(at)mobydisk(dot)com>
To:	Adrian Klaver <aklaver(at)comcast(dot)net>
Cc:	Postgres General List <pgsql-general(at)postgresql(dot)org>
Subject:	Re: Simple query not using index: why?
Date:	2008-09-03 19:55:17
Message-ID:	48BEEBA5.3030108@mobydisk.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Can't it just scan the index to get that? I assumed the index had links
to every fileid in the table. In my over-simplified imagination, the
table looks like this:

While the index looks like
fileid|ctid
fileid|ctid
fileid|ctid
fileid|ctid
...

So I expected scanning the index was faster, and still had everything it
needed to do the count. Or perhaps it was because I said COUNT(*) so it
needs to look at the other columns in the table? I really just wanted
the number of "hits" not the number of records with distinct values or
anything like that. My understanding was that COUNT(*) did that, and
didn't really look at the columns themselves.

Adrian Klaver wrote:
> -------------- Original message ----------------------
> From: William Garrison <postgres(at)mobydisk(dot)com>
>
>> I am looking for records with duplicate keys, so I am running this query:
>>
>> SELECT
>> fileid, COUNT(*)
>> FROM
>> file
>> GROUP BY
>> fileid
>> HAVING
>> COUNT(*)>1
>>
>> The table has an index on fileid (non-unique index) so I am surprised
>> that postgres is doing a table scan. This database is >15GB, and there
>> are a number of fairly large string columns in the table. I am very
>> surprised that scanning the index is not faster than scanning the
>> table. Any thoughts on that? Is scanning the table faster than
>> scanning the index? Is there a reason that it needs anything other than
>> the index?
>>
>>
>
> I may be missing something, but it would have to scan the entire table to get all the occurrences of each fileid in order to do the count(*).
>
>
>
> --
> Adrian Klaver
> aklaver(at)comcast(dot)net
>
>
>

In response to

Re: Simple query not using index: why? at 2008-09-03 17:47:30 from Adrian Klaver

Responses

Re: Simple query not using index: why? at 2008-09-03 20:01:30 from Joshua Drake

Browse pgsql-general by date

	From	Date	Subject
Next Message	Joshua Drake	2008-09-03 20:01:30	Re: Simple query not using index: why?
Previous Message	Tony Caduto	2008-09-03 19:54:52	Re: SELECT INTO returns incorrect values