Re: Multi-pass planner

From: Greg Stark <stark(at)mit(dot)edu>
To: decibel <decibel(at)decibel(dot)org>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Multi-pass planner
Date: 2013-04-04 01:40:50
Message-ID: CAM-w4HMmxpmM0BfaS53YwU37q4wPFTp0zohuEwgD0vGven8=tQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 21, 2009 at 6:54 PM, decibel <decibel(at)decibel(dot)org> wrote:

> Would it? Risk seems like it would just be something along the lines of
> the high-end of our estimate. I don't think confidence should be that hard
> either. IE: hard-coded guesses have a low confidence. Something pulled
> right out of most_common_vals has a high confidence. Something estimated
> via a bucket is in-between, and perhaps adjusted by the number of tuples.
>

I used to advocate a similar idea. But when questioned on-list I tried to
work out the details and ran into problems coming up with a concrete plan.

How do you compare a plan that you think has a 99% chance of running in 1ms
but a 1% chance of taking 1s against a plan that has a 90% chance of 1ms
and a 10% chance of taking 100ms? Which one is actually riskier? They might
even both have the same 95th-percentile run-time.
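To make that concrete, here is a rough back-of-the-envelope sketch (plain
Python, nothing to do with actual planner code) that computes the expected
cost and a chosen percentile for the two hypothetical distributions above.
The probabilities and timings are the ones just quoted; the helper names and
everything else are invented for illustration:

    # Toy comparison of two hypothetical run-time distributions,
    # each given as a list of (probability, milliseconds) pairs.

    def expected_ms(dist):
        """Expected run time of a discrete distribution."""
        return sum(p * ms for p, ms in dist)

    def percentile_ms(dist, q):
        """Smallest run time t such that P(run time <= t) >= q."""
        total = 0.0
        for p, ms in sorted(dist, key=lambda x: x[1]):
            total += p
            if total >= q:
                return ms
        return max(ms for _, ms in dist)

    plan_a = [(0.99, 1), (0.01, 1000)]   # 99% chance of 1ms, 1% chance of 1s
    plan_b = [(0.90, 1), (0.10, 100)]    # 90% chance of 1ms, 10% chance of 100ms

    print(expected_ms(plan_a), expected_ms(plan_b))            # 10.99 vs 10.9
    print(percentile_ms(plan_a, 0.95), percentile_ms(plan_b, 0.95))

With these particular numbers the expected costs come out nearly identical
(10.99ms vs 10.9ms), so a single summary statistic gives little guidance on
which plan is actually the riskier bet.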

And additionally there are different kinds of unknowns. Do you want to treat
a plan whose estimate comes from a statistical sample, which gives us a
probabilistic answer, the same as a plan where we think our cost model itself
has a 10% chance of being wrong? The model is going to be either consistently
right or consistently wrong for a given query, whereas the sample will vary
from run to run (or vice versa, depending on the situation).
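As a toy illustration of that distinction (again just a sketch with made-up
numbers, nothing to do with the real statistics code): a sample-based
estimate scatters around the true value from one sampling to the next,
whereas a broken model is off by the same factor on every execution:

    import random

    TRUE_ROWS = 10_000     # hypothetical true row count for some predicate
    SAMPLE_SD = 2_000      # pretend spread of a sample-based estimate
    MODEL_BIAS = 0.5       # pretend the cost model is systematically off by 2x

    def sampled_estimate():
        # Varies each time the sample is taken: wrong in a different way per run.
        return random.gauss(TRUE_ROWS, SAMPLE_SD)

    def model_estimate():
        # Deterministic: wrong (or right) in exactly the same way every run.
        return TRUE_ROWS * MODEL_BIAS

    for _ in range(3):
        print(round(sampled_estimate()), round(model_estimate()))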

--
greg
