Re: Why isn't an index being used when selecting a distinct value?

From: Keaton Adams <kadams(at)mxlogic(dot)com>
To: Gregory Stark <stark(at)enterprisedb(dot)com>
Cc: <pgsql-general(at)postgresql(dot)org>
Subject: Re: Why isn't an index being used when selecting a distinct value?
Date: 2008-02-18 05:22:33
Message-ID: C3DE6429.2DFF%kadams@mxlogic.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

The GROUP BY was the fastest method.

Thanks for the suggestions,

Keaton

On 2/15/08 3:12 PM, "Gregory Stark" <stark(at)enterprisedb(dot)com> wrote:

> "Keaton Adams" <kadams(at)mxlogic(dot)com> writes:
>
>> Version: Postgres 8.1.4
>> Platform: RHEL
>>
>> Given this scenario with the indexes in place, when I ask for the distinct
>> field1_id values, why does the optimizer choose a sequential scan instead of
>> just reading from the kda_log_fid_cre_20080123_idx index? The time it takes
>> to perform the sequential scan against 20+ million records is way too slow.
>
> Try (temporarily) doing:
>
> SET enable_seqscan = off;
>
>> keaton=# explain select distinct field1_id into temp kda_temp from
>> kda_log_20080213;
>
> If the database is right that will be even slower. Using a full index scan
> requires a lot of random access seeks, generally the larger the table the
> *more* likely a sequential scan and sort is a better approach than using an
> index.
>
> If it's wrong and it's faster then you have to consider whether it's only
> faster because you've read the table into cache already. Will it be in cache
> in production? If so then you migth try raising effective_cache_size or
> lowering random_page_cost.
>
> Another thing to try is using GROUP BY instead of DISTINCT. This is one case
> where the postgres optimizer doesn't handle the two equivalent cases in
> exactly the same way and there are some plans available in one method that
> aren't in the other. That's only likely to help if you have relative few
> values of field1_id but it's worth trying.

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Tim Hart 2008-02-18 05:43:13 Re: Question about the enum type
Previous Message Tom Lane 2008-02-18 05:21:06 Re: Question about the enum type