Why is PostgreSQL 9.1 not using index for simple equality select

From: Yang Zhang <yanghatespam(at)gmail(dot)com>
To: pgsql-general(at)postgresql(dot)org
Subject: Why is PostgreSQL 9.1 not using index for simple equality select
Date: 2013-04-12 08:03:34
Message-ID: CAKxBDU8u8sOWy-hSoM7YCR-AhzbN+cYFpS0oc-s0yJbXg2uG_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Any hints with this question I had posted to SO?

http://stackoverflow.com/questions/15965785/why-is-postgresql-9-1-not-using-index-for-simple-equality-select

Pasted here as well. Thanks.

My table `lead` has an index:

\d lead
...
Indexes:
"lead_pkey" PRIMARY KEY, btree (id)
"lead_account__c" btree (account__c)
...
"lead_email" btree (email)
"lead_id_prefix" btree (id text_pattern_ops)

Why doesn't PG (9.1) use the index for this straightforward equality
selection? Emails are almost all unique....

db=> explain select * from lead where email = 'blah';
QUERY PLAN
------------------------------------------------------------
Seq Scan on lead (cost=0.00..319599.38 rows=1 width=5108)
Filter: (email = 'blah'::text)
(2 rows)

Other index-hitting queries seem to be OK (though I don't know why
this one doesn't just use the pkey index):

db=> explain select * from lead where id = '';
QUERY PLAN
------------------------------------------------------------------------------
Index Scan using lead_id_prefix on lead (cost=0.00..8.57 rows=1
width=5108)
Index Cond: (id = ''::text)
(2 rows)

db=> explain select * from lead where account__c = '';
QUERY PLAN
----------------------------------------------------------------------------------
Index Scan using lead_account__c on lead (cost=0.00..201.05
rows=49 width=5108)
Index Cond: (account__c = ''::text)
(2 rows)

At first I thought it may be due to not enough distinct values of
`email`. For instance, if the stats claim that `email` is `blah` for
most of the table, then a seq scan is faster. But that's not the
case:

db=> select count(*), count(distinct email) from lead;
count | count
--------+--------
749148 | 733416
(1 row)

Even if I force seq scans to be off, the planner behaves as if it has
no other choice:

db=> set enable_seqscan = off;
SET
db=> show enable_seqscan;
enable_seqscan
----------------
off
(1 row)

db=> explain select * from lead where email = 'foo(at)blah(dot)com';
QUERY PLAN
---------------------------------------------------------------------------
Seq Scan on lead (cost=10000000000.00..10000319599.38 rows=1 width=5108)
Filter: (email = 'foo(at)blah(dot)com'::text)
(2 rows)

I searched over a good number of past SO questions but none were about
a simple equality query like this one.

Responses

Browse pgsql-general by date

  From Date Subject
Next Message John R Pierce 2013-04-12 08:13:59 Re: Why is PostgreSQL 9.1 not using index for simple equality select
Previous Message John R Pierce 2013-04-12 07:59:54 Re: How to convert US date format to European date format ?