a question about row estimation in postgres

From: Reynold Xin <rxin(at)cs(dot)berkeley(dot)edu>
To: pgsql-general(at)postgresql(dot)org
Subject: a question about row estimation in postgres
Date: 2011-03-20 07:20:55
Message-ID: AANLkTimnoa4mS1SQN=gW1TYzRDwbbsghApMZLz4uMQUd@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

I have a rankings table and it has 1302 rows in total. I am a bit confused
by how postgres (8.2.11) calculates the cardinality for this rankings table
based on < predicates on gradrate attribute.

select histogram_bounds from pg_stats where attname = 'gradrate' and
tablename = 'rankings';
histogram_bounds
------------------------------------
{8,33,40,46,55,61,69,75,81,90,118}

explain SELECT * FROM rankings WHERE gradrate < 11;
QUERY PLAN
---------------------------------------------------------------------------------
Index Scan using gradrate_idx on rankings (cost=0.00..44.24
*rows=11* width=196)
Index Cond: (gradrate < 11::double precision)
(2 rows)

explain select * from rankings where gradrate < 10;
QUERY PLAN

--------------------------------------------------------------------------------
Index Scan using gradrate_idx on rankings (cost=0.00..32.24 *rows=7*
width=196)
Index Cond: (gradrate < 10::double precision)
(2 rows)

Following the formula outlined in
http://www.postgresql.org/docs/8.3/static/row-estimation-examples.html

Both gradrate 10 and gradrate 11 would fall in the first bucket.

Shouldn't the row estimation be:

(11 - 8) / (33 - 8) / 10 * 1302 = 15.624
and
(10 - 8) / (33 - 8) / 10 * 1302 = 10.416

instead of 11 and 7?

Perhaps I am missing something. I'd appreciate if you can point it out.
Thanks!

--
Reynold Xin

Responses

Browse pgsql-general by date

  From Date Subject
Next Message srikanth subramaniam 2011-03-20 10:06:14 cannot connect to server
Previous Message Gurjeet Singh 2011-03-20 07:16:51 Re: How do you stop the log from rotating on restart?