From: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Subject: | Re: allowing extensions to control planner behavior |
Date: | 2024-10-23 08:51:02 |
Message-ID: | deb87eba-d4e2-40ad-84e9-219a25516b2d@gmail.com |
Lists: | pgsql-hackers |
On 10/23/24 15:05, Robert Haas wrote:
> On Sat, Oct 19, 2024 at 6:00 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
>> Generally, a hash value doesn't 100% guarantee the uniqueness of a node
>> identification. Also, RelOptInfo corresponds to a subtree in the final
>> plan, and sometimes, it takes work to find which node in the partially
>> executed plan corresponds to this specific estimation on row number
>> during selectivity estimation. Remember parameterised paths - you should
>> attach some signature for each path. So, it is not a fully strict method.
>> If you are interested, I can perhaps explain the method a little bit
>> more at some meetup.
>
> Yeah, I agree that this is not the best method. While it's true that
> you could get a false match in case of a hash value collision, IMHO
> the bigger problem is that it seems like an expensive way of
> determining something that we really should know already. If the user
> types the same query, mentioning the same relations, in the same
> order, with the same constructs around them, it's hard to believe that
> hashing is the cheapest way of matching up the old and new ones. I'm
> not sure exactly what we should do instead, but it feels like we more
> or less have this information during parsing and then we lose track of
> it as the query goes through the rewrite and planning phases.
A single parse tree may be implemented by multiple execution plans. Even
clauses can be transformed during optimisation (remember the OR -> ANY
transformation). Also, the cardinality of a mid-tree join depends on its
inner and outer subtrees. Because of that, computing a hash over a
RelOptInfo's relids and restrictions, combined with the hashes of its child
RelOptInfos, and carrying it through all later stages up to the end of
execution is the most stable approach I know.
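
As a rough illustration of that idea (a sketch only, not the code of any
actual extension): the Python below models a RelOptInfo-like node with a
hypothetical RelNode class and computes a signature from its relids, its
restriction clauses, and the signatures of its children. RelNode, signature(),
the clause strings, and the choice of SHA-256 are all assumptions made for the
example. It also assumes clauses arrive already normalised and that child
order should not affect the result.

```python
import hashlib
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class RelNode:
    """Hypothetical stand-in for a RelOptInfo (not a PostgreSQL structure)."""
    relids: FrozenSet[int]                 # range-table indexes covered
    clauses: Tuple[str, ...] = ()          # normalised restriction clauses
    children: Tuple["RelNode", ...] = ()   # child nodes of a join


def signature(node: RelNode) -> str:
    """Hash relids + restrictions + child signatures, bottom-up."""
    h = hashlib.sha256()
    h.update(b"relids:" + ",".join(map(str, sorted(node.relids))).encode())
    # Sort clauses so that their textual order does not change the hash.
    for clause in sorted(node.clauses):
        h.update(b"|clause:" + clause.encode())
    # Assumption: sort child signatures so swapping the two sides of a
    # join yields the same signature.
    for child_sig in sorted(signature(c) for c in node.children):
        h.update(b"|child:" + child_sig.encode())
    return h.hexdigest()


# Two base relations and a join over them, built in both child orders.
a = RelNode(frozenset({1}), ("a.x = 1",))
b = RelNode(frozenset({2}))
join_ab = RelNode(frozenset({1, 2}), ("a.id = b.id",), (a, b))
join_ba = RelNode(frozenset({1, 2}), ("a.id = b.id",), (b, a))

# Same join, but one child carries a different restriction.
a_other = RelNode(frozenset({1}), ("a.x = 2",))
join_other = RelNode(frozenset({1, 2}), ("a.id = b.id",), (a_other, b))
```

Here signature(join_ab) equals signature(join_ba), while signature(join_other)
differs, because the changed restriction in the subtree propagates up into the
join's hash. As noted earlier in the thread, a hash match still does not fully
guarantee identity, so collisions remain possible.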
--
regards, Andrei Lepikhov