From: | "Nick Fankhauser" <nickf(at)ontko(dot)com> |
---|---|
To: | "Pgsql-Performance(at)Postgresql(dot) Org" <pgsql-performance(at)postgresql(dot)org> |
Subject: | How to make n_distinct more accurate. |
Date: | 2003-09-22 20:42:27 |
Message-ID: | NEBBLAAHGLEEPCGOBHDGAEOIIIAA.nickf@ontko.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
Hi-
I have a table- called "event" with a field event_date_time that is indexed.
There are 1,700,000 rows in the table and 92,000 distinct values of
event_date_time with anywhere from 1 to 400 rows sharing the same value. (I
did a count grouped by event_date_time & scanned it to get this info.)
When I look at the pg_stats on this table, I always see 15,000 or lower in
the n_distinct column for event_date_time. (I re-ran analyze several times &
then checked pg_stats to see if the numbers varied significantly.)
Since this is off by about a factor of 6, I think the planner is missing the
chance to use this table as the "driver" in a complex query plan that I'm
trying to optimize.
So the question is- how can I get a better estimate of n_distinct from
analyze?
If I alter the stats target as high as it will go, I get closer, but it
still shows the index to be about 1/2 as selective as it actually is:
alpha=# alter table event alter column event_date_time set statistics 1000;
ALTER TABLE
alpha=# analyze event;
ANALYZE
alpha=# select n_distinct from pg_stats where tablename='event' and
attname='event_date_time';
n_distinct
------------
51741
(1 row)
This number seems to be consistently around 51,000 if I re-run analyze a few
times.
I guess my question is two-part:
(1)Is there any tweak to make this estimate work better?
(2)Since I'm getting numbers that are consistent but way off, is there a bug
here?
(2-1/2) Or alternately, am I totally missing what n-distinct is supposed to
denote?
Thanks!
-Nick
---------------------------------------------------------------------
Nick Fankhauser
nickf(at)doxpop(dot)com Phone 1.765.965.7363 Fax 1.765.962.9788
doxpop - Court records at your fingertips - http://www.doxpop.com/
From | Date | Subject | |
---|---|---|---|
Next Message | Bruce Momjian | 2003-09-23 13:59:01 | Re: restore time: sort_mem vs. checkpoing_segments |
Previous Message | Vivek Khera | 2003-09-22 20:17:54 | Re: restore time: sort_mem vs. checkpoing_segments |