Re: enforcing a plan (in brief)

From: pgsql(at)mohawksoft(dot)com
To: "Neil Conway" <neilc(at)samurai(dot)com>
Cc: "Bruce Momjian" <pgman(at)candle(dot)pha(dot)pa(dot)us>, "Hicham G(dot) Elmongui" <elmongui(at)cs(dot)purdue(dot)edu>, "pgsql-hackers" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enforcing a plan (in brief)
Date: 2005-02-14 20:54:48
Message-ID: 16803.24.91.171.78.1108414488.squirrel@mail.mohawksoft.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Thu, 2005-02-10 at 14:37 -0500, Bruce Momjian wrote:
>> No, we feel that is of limited value. If the optimizer isn't doing
>> things properly, we will fix it.
>
> I agree that improving the optimizer is the right answer for normal
> usage, so I can't get excited about query-level plan hints, but I can
> see the capability to instruct the planner being useful in an academic
> context.
>

I think that is sort of arrogant. Look at Oracle, you can give the planner
hints in the form of comments.

The idea that constructing a planner that will always do the best job is
like creating a program that can predict the weather. There are too many
subtle variations in datasets that are impossible to really evalute. I
posted a message last week called "One Big trend vs multiple smaller
trends." and you'll see what I mean.

Yea, on a simple data organization, you could make a great planner, but
someone who has studied the nature of their data can almost always toss
their hands up in frustration because the planner isn't working right.

I have had multiple issues with the inability to guide the planner on its
decisions. I'll give a couple examples:

A music database where the artist name is "Various Artists," given any
normal database of music recordings, "Various Artists" will be *THE* most
popular artist, usually close to almost half the data. Most of the time
I've had to turn off sequential scans for these queries. (I filter out
"various artists") Being able to say:

select * from cdtitles where artist = 'foo' /* index scan
ctitles_artist_ndx */ ;

Would be helpful as disabling sequential scan isn't always the right think
either.

The whole "Query optimizer 8.0.1 (and 8.0)" series of posts show a
different problem.

It all comes down to that the planner *can not* be perfect, and thus will
always be lacking in some respect. This is because you can not anticipate
every physical data storage pattern, therefore, the analyzer will not
correctly characterize them, and the planner will not create an optimal
plan.

Allowing the user to suggest alternate query strategies is a good idea.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message John Hansen 2005-02-14 21:36:44 Re: Schema name of function
Previous Message Tom Lane 2005-02-14 19:40:34 Re: Help me recovering data