From: | Tobias Zahn <tobias-zahn(at)arcor(dot)de> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: GEQO: ERX |
Date: | 2009-05-13 20:14:25 |
Message-ID: | 4A0B2A21.1000908@arcor.de |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hello,
thank you for posting the paper, it was quite interesting to read. I
think it would be a good idea to give the two-phase optimization a try.
As far as I know and understand the current (geqo) optimizer source,
many important parts are already there. For example, we can calculate
the costs of a given join order. Therefore, it would only be necessary
to write an algorithm witch chooses the right input for the cost function.
I would be interested in your opinion.
Regards,
Tobias
Robert Haas schrieb:
> On Sat, May 2, 2009 at 11:37 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> Tobias Zahn <tobias-zahn(at)arcor(dot)de> writes:
>>> I didn't not get any response to my initial message below. Now I am
>>> wondering if nobody is into the optimizer or if my question was just too
>>> stupid. Could you please give me some clues? Your help would really be
>>> appreciated.
>> Well, nobody's into GEQO very much. I took a quick look and didn't
>> think that deleting the ERX support would save anything noticeable,
>> but you're welcome to try it if you think different.
>>
>> The real problem with working on GEQO, in my humble opinion, is that
>> it's throwing good effort after bad. That module doesn't need marginal
>> fixing, it needs throwing away and rewriting from scratch. Bad enough
>> that it's convoluted and full of dead (experimental?) code; but I don't
>> even believe that it's based on a good analogy. The planning problem
>> is not all that much like traveling salesman problems, so heuristics
>> designed for TSP are of pretty questionable usefulness to start with.
>> That complaint could have been refuted if the module performed well,
>> but in fact if you check the archives you'll find many many complaints
>> about it --- its ability to find good plans seems to be mostly dependent
>> on luck.
>>
>> My knowledge of AI search algorithms is about 20 years obsolete, but
>> last I heard simulated annealing had overtaken genetic algorithms for
>> many purposes. It might be interesting to try a rewrite based on SA;
>> or maybe there's something better out there now.
>
> There's a 1997 article on this topic that's pretty interesting.
>
> Heuristic and randomized optimization for the join ordering problem
> http://reference.kfupm.edu.sa/content/h/e/heuristic_and_randomized_optimization_fo_87585.pdf
>
> Here's the basic conclusion:
>
> "If good solutions are of highest importance, Two-Phase Optimization,
> the algorithm that performed best in our experiments, is a very good
> choice; other Simulated Annealing variants, for instance Toured
> Simulated Annealing (TSA, LVZ93]), that we did not implement, are
> likely to achieve quite similar results. The 'pure' Simulated
> Annealing algorithm has a much higher running time without yielding
> significantly better solutions. If short running time is more
> important, Iterative Improvement (IIIO), the genetic algo- rithm
> (BushyGenetic), and, to a lesser extent, Two-Phase Optimization (2PO)
> are feasible alternatives."
>
> I'm not sure if there's anything more recent out there.
>
> ...Robert
>
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Lane | 2009-05-13 20:30:19 | Re: libxml incompatibility |
Previous Message | Kevin Field | 2009-05-13 19:45:59 | Re: pg_views definition format |