From: | Martijn van Oosterhout <kleptog(at)svana(dot)org> |
---|---|
To: | Jeff Davis <pgsql(at)j-davis(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org, npboley(at)gmail(dot)com |
Subject: | Re: new correlation metric |
Date: | 2008-10-26 13:49:44 |
Message-ID: | 20081026134943.GA8427@svana.org |
Lists: | pgsql-hackers |
On Sun, Oct 26, 2008 at 01:38:02AM -0700, Jeff Davis wrote:
> I worked with Nathan Boley to come up with what we think is a better
> metric for measuring this cost. It is based on the number of times in
> the ordered sample that you have to physically backtrack (i.e. the data
> value increases, but the physical position is earlier).
>
> For example, if the table's physical order is
>
> 6 7 8 9 10 1 2 3 4 5
How does it deal with a case like the following:
1 6 2 7 3 8 4 9 5 10 (interleaving)
ISTM that your code will overestimate the cost whereas the old code
wouldn't have done too badly.
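
The quoted definition can be tried directly on both orderings. Below is a minimal sketch in Python, assuming the metric simply counts the places where walking the values in ascending order moves to an earlier physical position. `count_backtracks` is a hypothetical name for illustration, not the function from the patch.

```python
def count_backtracks(physical_order):
    """Count backward jumps in physical position when values are visited in ascending order."""
    # Physical position of each tuple, listed in ascending value order.
    positions = [pos for pos, _ in
                 sorted(enumerate(physical_order), key=lambda item: item[1])]
    # A backtrack is any step where the next value sits at an earlier position.
    return sum(1 for prev, cur in zip(positions, positions[1:]) if cur < prev)


two_halves = [6, 7, 8, 9, 10, 1, 2, 3, 4, 5]    # ordering from the quoted example
interleaved = [1, 6, 2, 7, 3, 8, 4, 9, 5, 10]   # ordering from the question above

print("two halves :", count_backtracks(two_halves))
print("interleaved:", count_backtracks(interleaved))
```

Comparing the two printed counts shows how this reading of the metric treats the interleaved case relative to the original example.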
I think the code is in the right direction, but I think what you want
is some kind of estimate of "given I've looked for tuple X, how many
tuples in the next k pages are near this one". Unfortunately I don't see
a way of calculating it other than a full simulation.
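
One possible reading of "full simulation" is to replay an index-ordered scan against the physical layout and count page fetches. The sketch below assumes fixed-size pages and a small LRU cache of `cache_pages` pages standing in for "the next k pages". The function name and both parameters are illustrative assumptions, not anything from the patch or from PostgreSQL.

```python
from collections import OrderedDict


def simulate_page_fetches(physical_order, tuples_per_page, cache_pages):
    """Replay a scan in ascending value order and count page fetches (LRU cache misses)."""
    # Page number of each tuple, listed in ascending value order.
    pages = [pos // tuples_per_page for pos, _ in
             sorted(enumerate(physical_order), key=lambda item: item[1])]
    cache = OrderedDict()   # keys are page numbers, ordered oldest to most recent
    fetches = 0
    for page in pages:
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recently used
        else:
            fetches += 1                   # miss: the page has to be read
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
    return fetches


interleaved = [1, 6, 2, 7, 3, 8, 4, 9, 5, 10]
for k in (1, 5):
    print(k, simulate_page_fetches(interleaved, tuples_per_page=2, cache_pages=k))
```

Varying `cache_pages` shows how much the cost of a layout depends on how many nearby pages stay available, which a single backtrack count does not capture.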
Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.