From: | Josh Berkus <josh(at)agliodbs(dot)com> |
---|---|
To: | "pgsql-perform" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Bad n_distinct estimation; hacks suggested? |
Date: | 2005-04-19 19:09:05 |
Message-ID: | 200504191209.05181.josh@agliodbs.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
Folks,
Params: PostgreSQL 8.0.1 on Solaris 10
Statistics = 500
(tablenames have been changed to protect NDA)
e1=# select tablename, null_frac, correlation, n_distinct from pg_stats where
tablename = 'clickstream1' andattname = 'session_id';
tablename | null_frac | correlation | n_distinct
----------------------+-----------+-------------+------------
clickstream1 | 0 | 0.412034 | 378174
(2 rows)
e1=# select count(distinct session_id) from clickstream1;
count
---------
3174813
As you can see, n_distinct estimation is off by a factor of 10x and it's
causing query planning problems. Any suggested hacks to improve the
histogram on this?
(BTW, increasing the stats to 1000 only doubles n_distinct, and doesn't solve
the problem)
--
--Josh
Josh Berkus
Aglio Database Solutions
San Francisco
From | Date | Subject | |
---|---|---|---|
Next Message | Matthew T. O'Connor | 2005-04-19 19:44:15 | Re: Win32 presentation |
Previous Message | Victor Y. Yegorov | 2005-04-19 19:03:19 | Re: Comparing Datum's at aminsert() stage |
From | Date | Subject | |
---|---|---|---|
Next Message | Dave Held | 2005-04-19 20:01:57 | Re: Bad n_distinct estimation; hacks suggested? |
Previous Message | Tom Lane | 2005-04-19 18:34:35 | Re: Question on REINDEX |