From: Merlin Moncure <mmoncure(at)gmail(dot)com>
To: Claudio Freire <klaussfreire(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jim Nasby <jim(at)nasby(dot)net>, Stephen Frost <sfrost(at)snowman(dot)net>, David Johnston <polobo(at)yahoo(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Planner hints in Postgresql
Date: 2014-03-18 19:48:26
Message-ID: CAHyXU0wuzc8mnd=+DWaP088-zTp=5qSv2jsiZ2gbeF3OA7ojCg@mail.gmail.com
Lists: pgsql-hackers
On Tue, Mar 18, 2014 at 11:53 AM, Claudio Freire <klaussfreire(at)gmail(dot)com> wrote:
>
> On Mon, Mar 17, 2014 at 8:47 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>> Claudio Freire <klaussfreire(at)gmail(dot)com> writes:
>> > On Mon, Mar 17, 2014 at 7:01 PM, Jim Nasby <jim(at)nasby(dot)net> wrote:
>> >> Even better would be if the planner could estimate how bad a plan will
>> >> become if we made assumptions that turn out to be wrong.
>>
>> > That's precisely what risk estimation was about.
>>
>> Yeah. I would like to see the planner's cost estimates extended to
>> include some sort of uncertainty estimate, whereupon risk-averse people
>> could ask it to prefer low-uncertainty plans over high-uncertainty ones
>> (the plans we typically choose for ORDER BY ... LIMIT queries being great
>> examples of the latter). But it's a long way from wishing that to making
>> it so. Right now it's not even clear (to me anyway) how we'd measure or
>> model such uncertainty.
>
> Well, currently, selectivity estimates based on MCV should be pretty
> low-uncertainty, whereas certainty of other estimates could be modeled as a
> random variable if ANALYZE gathered a few statistical moments (for variables
> that are prone to that kind of statistical analysis).
Sure, plus as noted you have cases where the planner makes SWAGs. Each
of those SWAGs can introduce (in the worst case, say) an order of
magnitude of error in the row count estimate.
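To illustrate the compounding (a toy sketch only -- none of these types
or functions exist in the PostgreSQL source, it just shows how per-node
error factors multiply up a join tree):

/* Toy illustration, not PostgreSQL code. */
#include <stdio.h>

typedef struct NodeGuess
{
    double rows;        /* best-guess row count for this node */
    double err_factor;  /* multiplicative uncertainty; 10.0 for a SWAG */
} NodeGuess;

/* Joining two nodes multiplies the row estimates by the selectivity
   guess, and the uncertainty factors multiply right along with them. */
static NodeGuess
join_guess(NodeGuess a, NodeGuess b, double sel, double sel_err)
{
    NodeGuess out;

    out.rows = a.rows * b.rows * sel;
    out.err_factor = a.err_factor * b.err_factor * sel_err;
    return out;
}

int
main(void)
{
    NodeGuess t1 = { 1000.0, 10.0 };    /* scan with a SWAGged qual */
    NodeGuess t2 = { 5000.0, 10.0 };    /* ditto */
    NodeGuess j = join_guess(t1, t2, 0.001, 10.0); /* SWAGged join sel */

    /* three stacked SWAGs: the true rowcount could plausibly be
       anywhere within a factor of 1000 of the estimate */
    printf("estimate %.0f rows, plausible range %.0f .. %.0f\n",
           j.rows, j.rows / j.err_factor, j.rows * j.err_factor);
    return 0;
}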
> That alone could improve things considerably, and statistical info could be
> propagated along expressions to make it possible to model uncertainty in
> complex expressions as well.
But how would that work? I see no solution adumbrated there :-).
Let's say you change the rowcount estimate to low/bestguess/high *and*
you only engage extra searches when there is enough disparity between
those values; you still get exponentially more searches. (My thinking
is that if the bestguess estimated execution time at any node is some
user-definable amount faster than the low/high estimates, a more
skeptical plan is introduced.) All that could end up being pessimal
for the general case though.
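To make that trigger condition concrete, here is a toy sketch (entirely
made up -- no such structs or settings exist in the planner today):

/* Purely illustrative: a low/bestguess/high cost triple and the
   disparity test deciding whether a node is risky enough to justify
   a second, skeptical plan search. */
#include <stdbool.h>
#include <stdio.h>

typedef struct CostRange
{
    double low;     /* cost if the rowcount guesses were optimistic */
    double best;    /* cost at the best-guess rowcounts */
    double high;    /* cost if the rowcount guesses were pessimistic */
} CostRange;

/* hypothetical user-definable knob */
static double risk_threshold = 10.0;

static bool
needs_skeptical_replan(const CostRange *c)
{
    /* only pay for the extra search when the downside dwarfs the
       best-guess cost */
    return c->high > c->best * risk_threshold;
}

int
main(void)
{
    CostRange limit_plan = { 10.0, 100.0, 50000.0 }; /* ORDER BY .. LIMIT */
    CostRange boring_plan = { 80.0, 100.0, 300.0 };

    printf("limit plan: %s\n",
           needs_skeptical_replan(&limit_plan) ? "replan" : "keep");
    printf("boring plan: %s\n",
           needs_skeptical_replan(&boring_plan) ? "replan" : "keep");
    return 0;
}

The general-case pessimality worry above is exactly that: every node
clearing the threshold multiplies the search space again.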
merlin