From: | Gregory Stark <stark(at)enterprisedb(dot)com> |
---|---|
To: | "Joshua Shanks" <jjshanks(at)gmail(dot)com> |
Cc: | "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Craig Ringer" <craig(at)postnewspapers(dot)com(dot)au>, <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: query planner not using the correct index |
Date: | 2008-08-07 22:38:07 |
Message-ID: | 874p5wsb74.fsf@oxford.xeocode.com |
Lists: | pgsql-performance |
"Joshua Shanks" <jjshanks(at)gmail(dot)com> writes:
> Those 3 values in reality and in the stats account for 98% of the
> rows. actual distinct values are around 350
Measuring n_distinct from a sample is inherently difficult and unreliable.
When 98% of your table falls into those three values, the sample has very few
chances to find many of the other distinct values.
I haven't seen the whole thread, but if you haven't already, you could try
raising the statistics target for these columns -- that's usually necessary
anyway when you have a very skewed distribution like this.
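To see why a bigger sample helps, here is a rough Python simulation, not anything taken from the ANALYZE code. It assumes a made-up column of 200,000 rows with 350 distinct values, 3 of which cover 98% of the rows, and it assumes ANALYZE samples roughly 300 rows per unit of statistics target (so 3,000 rows at a target of 10 and 30,000 at a target of 100). It only counts the distinct values that show up in the sample; the real estimator extrapolates from that, but it can only work with what the sample contains.

```python
import random


def build_column(n_rows=200_000, n_distinct=350, hot_values=3,
                 hot_fraction=0.98, seed=0):
    """Build a skewed column: a few hot values cover most of the rows."""
    rng = random.Random(seed)
    n_hot = int(n_rows * hot_fraction)
    # 98% of the rows take one of the hot values 0..hot_values-1 ...
    column = [rng.randrange(hot_values) for _ in range(n_hot)]
    # ... and the remaining rows are spread over the other values.
    column += [rng.randrange(hot_values, n_distinct)
               for _ in range(n_rows - n_hot)]
    return column


def distinct_seen(column, sample_size, seed=0):
    """Count the distinct values present in a random sample of rows."""
    rng = random.Random(seed)
    return len(set(rng.sample(column, sample_size)))


column = build_column()
true_distinct = len(set(column))
seen_small = distinct_seen(column, 3_000)    # assumed sample at target 10
seen_large = distinct_seen(column, 30_000)   # assumed sample at target 100

print("distinct values in table:      ", true_distinct)
print("seen in the 3,000-row sample:  ", seen_small)
print("seen in the 30,000-row sample: ", seen_large)
```

Only about 2% of the sampled rows come from outside the three hot values, so the small sample has on the order of 60 chances to hit the ~347 rare values, while the larger one has about 600.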
> It seems like the planner would want to get the result set from
> bars.bars_id condition and if it is big using the index on the join to
> avoid the separate sorting, but if it is small (0-5 rows which is our
> normal case) use the primary key index to join and then just quickly
> sort. Is there any reason the planner doesn't do this?
Yeah, Heikki's suggested having a kind of "branch" plan node that knows
where the break-point is between two plans and can call the appropriate one.
We don't have anything like that yet.
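Purely to illustrate the idea, here is a toy Python sketch of what such a node might do. None of this exists in PostgreSQL; the names `branch_node`, `small_plan`, `large_plan`, the in-memory "table", and the break-point of 5 are all hypothetical. The node peeks at no more than break_point + 1 outer rows, then hands them to whichever sub-plan suits that size.

```python
from itertools import chain, islice

# Hypothetical inner table: id -> sort key (keys are distinct by construction).
INNER = {i: (i * 7919) % 1009 for i in range(1000)}
# Stands in for an index that already returns ids in sort-key order.
ORDERED_IDS = sorted(INNER, key=INNER.get)


def small_plan(ids):
    """Few rows: look each one up by primary key, then sort explicitly."""
    return sorted(ids, key=INNER.get)


def large_plan(ids):
    """Many rows: walk the ordered index and keep matches, avoiding a sort."""
    wanted = set(ids)
    return [i for i in ORDERED_IDS if i in wanted]


def branch_node(outer_rows, break_point):
    """Read at most break_point + 1 rows, then pick the plan to run."""
    it = iter(outer_rows)
    head = list(islice(it, break_point + 1))
    if len(head) <= break_point:
        return "small", small_plan(head)
    # Re-attach the rows already consumed so the large plan sees them all.
    return "large", large_plan(chain(head, it))


print(branch_node([5, 3, 9], break_point=5))
print(branch_node(range(100), break_point=5)[0])
```

Both sub-plans return the same rows in the same order; the only difference is the cost, which is what the break-point is meant to capture.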
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning