From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Bruno Wolff III <bruno(at)wolff(dot)to> |
Cc: | pgsql-performance(at)postgresql(dot)org |
Subject: | Re: View performance |
Date: | 2002-12-26 23:43:39 |
Message-ID: | 20650.1040946219@sss.pgh.pa.us |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
I wrote:
>> This seems to indicate some estimation problems in cost_hashjoin; the
>> estimated cost for the hashjoin is evidently a lot higher than it should
>> be.
The answer is that estimate_hash_bucketsize() is producing a rather
silly result in this situation, viz. a bucketsize "fraction" that's well
above 1.0. I've applied the following band-aid patch to CVS tip, which
perhaps you might like to use locally. But probably the long-range
answer is to rethink what that routine is doing --- its adjustment for
skewed data distributions is perhaps not such a great idea.
regards, tom lane
*** src/backend/optimizer/path/costsize.c.orig Fri Dec 13 19:17:55 2002
--- src/backend/optimizer/path/costsize.c Thu Dec 26 18:34:02 2002
***************
*** 1164,1169 ****
--- 1164,1179 ----
if (avgfreq > 0.0 && mcvfreq > avgfreq)
estfract *= mcvfreq / avgfreq;
+ /*
+ * Clamp bucketsize to sane range (the above adjustment could easily
+ * produce an out-of-range result). We set the lower bound a little
+ * above zero, since zero isn't a very sane result.
+ */
+ if (estfract < 1.0e-6)
+ estfract = 1.0e-6;
+ else if (estfract > 1.0)
+ estfract = 1.0;
+
ReleaseSysCache(tuple);
return (Selectivity) estfract;
From | Date | Subject | |
---|---|---|---|
Next Message | yutaka_inada | 2002-12-27 07:03:48 | executing pgsql on Xeon-dual machine |
Previous Message | Tom Lane | 2002-12-26 20:45:55 | Re: View performance |