From: | Joshua Tolley <eggyknap(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Bryce Cutt <pandasuit(at)gmail(dot)com>, "Lawrence, Ramon" <ramon(dot)lawrence(at)ubc(dot)ca>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets |
Date: | 2008-12-23 18:28:19 |
Message-ID: | 20081223182818.GA5867@uber |
Lists: | pgsql-hackers |
On Tue, Dec 23, 2008 at 10:14:29AM -0500, Robert Haas wrote:
> > It's equivalent to our assumption that distributions of values in
> > columns in the same table are independent. Making that assumption in
> > this case would probably result in occasional dramatic speed
> > improvements similar to the ones we've seen in less complex joins,
> > offset by just-as-occasional dramatic slowdowns of similar magnitude. In
> > other words, it will increase the variance of our results.
>
> Under what circumstances do you think that it would produce a dramatic
> slowdown? I'm confused. I thought the penalty for picking a bad set
> of values for the in-memory hash table was pretty small.
>
> ...Robert
I take that back :) I agree with what others have already said, that it
shouldn't cause dramatic slowdowns when we get it wrong.
- Josh