On 15/10/2024 12:15, David Rowley wrote:
> On Tue, 15 Oct 2024 at 17:48, Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> I think maybe what is worth working on is seeing if you can better
> estimate the number of tiebreak comparisons by checking the number of
> distinct values for each sort key. That might require what we
> discussed on another thread about having estimate_num_groups() better
> use the n_distinct estimate from the EquivalenceMember with the least
> distinct values. It'll be another question if all that can be done
> and this all still perform well enough for cost_sort(). You may have
> to write it first before we can figure that out. It may be possible
> to buy back some of the overheads with some additional caching...
> Perhaps that could be done within the EquivalenceClass.
It seems that your idea of using the equivalence class works.
The attached patchset shows all the ideas together (I initially wanted
to discuss them step by step).
The first patch is about choosing an adequate expression from the list
of EC members.
The second one is the current patch, which considers the number of sort
columns.
The third patch elaborates on the second one. It uses only trustworthy
statistics: the ndistinct estimate for the first sort column.
And the last patch is a demo of how I'm going to use the previous three
patches to add one more strategy for improving the order of columns in
the GROUP BY clause.
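
To illustrate the tiebreak reasoning behind the second and third
patches, here is a minimal standalone sketch. It is not code from the
patches: both function names are hypothetical, and it assumes uniformly
distributed, independent columns, so two tuples tie on a column with
probability 1/ndistinct. The second function mimics the "trust only the
first column" approach by assuming that every remaining column is
compared whenever the first one ties.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not PostgreSQL code): expected number of
 * per-column comparisons needed to compare two tuples, given an
 * ndistinct estimate for every sort column.
 * Assumption: columns are independent and uniformly distributed.
 */
double
expected_column_comparisons(const double *ndistinct, size_t ncols)
{
    double expected = 0.0;
    double tie_prob = 1.0;  /* probability that all earlier columns tied */
    size_t i;

    for (i = 0; i < ncols; i++)
    {
        /* Clamp to 1 so a missing or bogus estimate cannot inflate the result */
        double d = (ndistinct[i] < 1.0) ? 1.0 : ndistinct[i];

        expected += tie_prob;   /* column i is compared only after ties */
        tie_prob /= d;
    }
    return expected;
}

/*
 * Hypothetical variant that uses only the first column's ndistinct.
 * Assumption: once the first column ties, all remaining columns are
 * compared, which is a pessimistic upper bound.
 */
double
expected_column_comparisons_first_only(double first_ndistinct, size_t ncols)
{
    double d = (first_ndistinct < 1.0) ? 1.0 : first_ndistinct;

    if (ncols == 0)
        return 0.0;
    return 1.0 + (double) (ncols - 1) / d;
}
```

With ndistinct estimates of 2, 4 and 8 for three sort columns, the first
function gives 1 + 1/2 + 1/8 = 1.625 column comparisons per tuple
comparison. A highly distinct leading column drives the result towards
1, which is the motivation for reordering columns.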
--
regards, Andrei Lepikhov