From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Kate Collins <klcollins(at)wsicorp(dot)com> |
Cc: | postgresql news group <pgsql-sql(at)postgresql(dot)org> |
Subject: | Re: SQL command speed |
Date: | 2000-05-19 21:39:01 |
Message-ID: | 9901.958772341@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-sql |
Kate Collins <klcollins(at)wsicorp(dot)com> writes:
> You are correct in your assessment of the nature of the distribution
> of the elements in the item_a column. Some values return none or a
> few rows. While other values return many rows (100 or more). It is
> not an even distribution. I had never really considered what effect
> this would have on searching before.
Apparently you have at least one value with close to 2000 rows (IIRC,
the rows estimate is basically half the MCV's row count, except in
the special case where the planner can see that you are comparing
against the MCV itself).
If that's a real data value and not a dummy, then of course you can't
replace it by NULL. In that case I'm kind of stuck for a good answer
for 7.0; we can't do much to distinguish the few-distinct-values case
from the many-distinct-values-with-some-frequency-outliers case until
we have better statistics than 7.0 keeps. What you could do as a
stopgap is to force the system to use an indexscan against its
judgment, by doing this before issuing the problem query:
SET enable_seqscan = OFF;
Naturally I recommend keeping this ON most of the time, but sometimes
you just have to get out the blunt instruments ;-)
regards, tom lane
From | Date | Subject | |
---|---|---|---|
Next Message | Peter Eisentraut | 2000-05-20 13:35:58 | Re: Question about databases in alternate locations... |
Previous Message | Tom Lane | 2000-05-19 19:48:46 | Re: Creating Tables in Functions |