From: | Bykov Ivan <I(dot)Bykov(at)modernsys(dot)ru> |
---|---|
To: | Andy Fan <zhihuifan1213(at)163(dot)com> |
Cc: | "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org> |
Subject: | RE: [PoC] Partition path cache |
Date: | 2024-10-29 06:27:25 |
Message-ID: | a6ffc8ebcabf4e169b5150dbe61a0ae7@localhost.localdomain |
Lists: | pgsql-hackers |
Hello
> This sounds like an interesting idea. I like it because it removes the need for the "global statistics" effort for partitioned tables, since it just uses the first partition it knows. Of course, it has the drawback that the "first" partition can't represent the other partitions.
This method uses global statistics for all partitions.
The cache uses the standard path-building functions (it still calculates selectivity for each path it builds), but for the second and later partitions in a group it avoids calling all of them.
The concept is similar to the GEQO method used for joins.
We skip creating some path variants if building all paths would take too long.
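To make the idea concrete, here is a minimal sketch in Python rather than planner C code. It is only an illustration of the description above, not the patch itself: `Partition`, `plan_partitions`, and the `builders` mapping are hypothetical names, and each builder stands in for a standard path-building function that returns a cost. The first partition in a group tries every builder; later partitions in the same group call only the builder that won for the first one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Partition:
    name: str
    indexes: FrozenSet[str]  # hypothetical: index definitions available on this partition


# Hypothetical stand-in for a path-building function: returns the path's cost.
PathBuilder = Callable[[Partition], float]


def plan_partitions(
    partitions: List[Partition],
    builders: Dict[str, PathBuilder],
) -> Dict[str, Tuple[str, float]]:
    """Return {partition name: (chosen path kind, cost)}.

    Partitions are grouped by their available index list. Only the first
    partition of each group pays for building every path variant.
    """
    cache: Dict[FrozenSet[str], str] = {}  # group key -> winning path kind
    chosen: Dict[str, Tuple[str, float]] = {}
    for part in partitions:
        key = part.indexes
        if key not in cache:
            # First partition in the group: build all variants, keep the cheapest.
            costs = {kind: build(part) for kind, build in builders.items()}
            best = min(costs, key=costs.get)
            cache[key] = best
            chosen[part.name] = (best, costs[best])
        else:
            # Later partition: build (and cost) only the cached path kind.
            kind = cache[key]
            chosen[part.name] = (kind, builders[kind](part))
    return chosen
```

The cost of the reused path kind is still computed per partition, so per-partition selectivity is taken into account; what is skipped is the search over the other variants.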
> One of the arguments about this patch might be "What if other partitions have pretty different statistics from the first partition?". If I were you, I might check all the statistics used at this stage and try to find a similar algorithm to prove that the best path would be similar too. This can happen once, when the statistics are gathered. However, this might not be easy.
Yes, maybe we can split partitions into groups not only by their available index lists but also by ranges of some statistical property.
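As a rough illustration of such a grouping key (again a Python sketch, not planner code), the key could combine the index list with a coarse bucket of one statistic. Using the estimated row count and bucketing it by its number of decimal digits are both assumptions chosen purely for the example; `group_key` is a hypothetical name.

```python
from typing import FrozenSet, Tuple


def group_key(indexes: FrozenSet[str], row_count: int) -> Tuple[FrozenSet[str], int]:
    """Hypothetical group key: available indexes plus a row-count bucket.

    The bucket is the number of decimal digits in the row count, so
    partitions whose sizes are within the same order of magnitude share a group.
    """
    bucket = len(str(row_count)) if row_count > 0 else 0
    return (indexes, bucket)
```

With this key, two partitions with the same indexes and 1,200 and 9,000 rows would share cached paths, while a partition with 150,000 rows would start its own group.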
--
Best Regards
Ivan Bykov