Weird, bad 0.5% selectivity estimate for a column equal to itself

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Weird, bad 0.5% selectivity estimate for a column equal to itself
Date: 2013-06-21 19:52:05
Message-ID: 51C4AEE5.8040704@agliodbs.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Folks,

I'm getting something really odd in 9.2.4, where the planner estimates
that the selectivity of a column equal to itself is always exactly 0.5%
(i.e. 0.005X). I can't figure out where this constant is coming from,
or why it's being applied.

Test case:

create table esttest (
id int not null primary key,
state1 int not null default 0,
state2 int not null default 0,
state3 int not null default 0
);

insert into esttest (id, state1, state2, state3)
select i,
(random()*3)::int,
(random())::int,
(random()*100)::int
from generate_series (1, 20000)
as gs(i);

vacuum analyze esttest;

explain analyze
select * from esttest
where state1 = state1;

explain analyze
select * from esttest
where state2 = state2;

explain analyze
select * from esttest
where state3 = state3;

Results of test case:

badestimate=# explain analyze
badestimate-# select * from esttest
badestimate-# where state1 = state1;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on esttest (cost=0.00..359.00 rows=100 width=16) (actual
time=0.009..4.145 rows=20000 loops=1)
Filter: (state1 = state1)
Total runtime: 5.572 ms
(3 rows)

badestimate=#
badestimate=# explain analyze
badestimate-# select * from esttest
badestimate-# where state2 = state2;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on esttest (cost=0.00..359.00 rows=100 width=16) (actual
time=0.006..4.166 rows=20000 loops=1)
Filter: (state2 = state2)
Total runtime: 5.595 ms
(3 rows)

badestimate=#
badestimate=# explain analyze
badestimate-# select * from esttest
badestimate-# where state3 = state3;
QUERY PLAN
----------------------------------------------------------------------------------------------------------
Seq Scan on esttest (cost=0.00..359.00 rows=100 width=16) (actual
time=0.005..4.298 rows=20000 loops=1)
Filter: (state3 = state3)
Total runtime: 5.716 ms
(3 rows)

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2013-06-21 21:32:58 Re: Weird, bad 0.5% selectivity estimate for a column equal to itself
Previous Message Maciek Sakrejda 2013-06-21 17:08:50 Re: Query tuning: partitioning, DISTINCT ON, and indexing