RE: [PoC] Partition path cache

From: Bykov Ivan <I(dot)Bykov(at)modernsys(dot)ru>
To: Andy Fan <zhihuifan1213(at)163(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: [PoC] Partition path cache
Date: 2024-10-29 06:27:25
Message-ID: a6ffc8ebcabf4e169b5150dbe61a0ae7@localhost.localdomain
Lists: pgsql-hackers

Hello

> This sounds like an interesting idea. I like it because it removes the need for the "global statistics" effort for partitioned tables, since it just uses the first partition it knows. Of course, it has the drawback that the "first"
> partition can't represent the other partitions.

This method uses global statistics for all partitions.
The cache uses the standard path-building functions (which calculate selectivity for each path), but it avoids calling all of them for the second and later partitions in a group.
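
To make the idea concrete, here is a minimal sketch in Python rather than planner C code. It is not the patch itself. All names (Partition, build_all_paths, toy_cost, PartitionPathCache) are hypothetical, the cost model is a toy, and it assumes that partitions are grouped by their set of available indexes and that later partitions in a group only rebuild and re-cost the path kind that won for the first one.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Partition:
    name: str
    indexes: FrozenSet[str]  # names of the indexes available on this partition
    rows: int                # toy per-partition statistic (assumption)


def build_all_paths(part: Partition) -> List[str]:
    """Hypothetical stand-in for full path enumeration on one partition."""
    return ["seqscan"] + [f"indexscan:{ix}" for ix in sorted(part.indexes)]


def toy_cost(part: Partition, kind: str) -> int:
    """Toy cost model (assumption): an index scan touches about 10% of rows."""
    return part.rows // 10 if kind.startswith("indexscan") else part.rows


class PartitionPathCache:
    """Remembers the winning path kind per group of partitions."""

    def __init__(self, cost_fn: Callable[[Partition, str], int]) -> None:
        self.cost_fn = cost_fn
        self.best_kind: Dict[FrozenSet[str], str] = {}
        self.full_builds = 0  # how many times all path variants were built

    def choose_path(self, part: Partition) -> Tuple[str, int]:
        key = part.indexes  # group key: the set of available indexes
        if key not in self.best_kind:
            # First partition of the group: enumerate and cost every variant.
            self.full_builds += 1
            self.best_kind[key] = min(
                build_all_paths(part),
                key=lambda kind: self.cost_fn(part, kind),
            )
        # Every partition is still costed with its own statistics, but only
        # for the cached path kind.
        kind = self.best_kind[key]
        return kind, self.cost_fn(part, kind)
```

In this sketch the cost is always computed from the partition's own statistics; only the choice of which path kind to build is reused within a group.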

The concept is similar to the GEQO method used for joins.
We skip creating some path variants if building all paths would take too long.

> One of the arguments about this patch might be "What if other partitions have pretty different statistics from the first partition?". If I were you, I might check all the statistics used at this stage and try to find a similar algorithm to prove that the best path would be similar too. This can happen once, when the statistics are gathered. However, this might not be easy.

Yes, maybe we can group partitions not only by their lists of available indexes but also by ranges of some statistical properties.
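
As a rough illustration only (Python, not planner code): the group key could combine the index list with a bucket of some statistic. The statistic used here (an estimated row count), the log10 bucketing, and the names group_key and group_partitions are all assumptions made for the sketch, not part of the patch.

```python
import math
from collections import defaultdict
from typing import Dict, FrozenSet, Iterable, List, Tuple

GroupKey = Tuple[FrozenSet[str], int]


def group_key(indexes: Iterable[str], est_rows: float) -> GroupKey:
    """Combine the index set with a decade bucket of the row estimate."""
    # Partitions whose estimates fall in the same power-of-ten range share
    # a bucket; values below 1 are clamped so log10 is defined.
    bucket = int(math.floor(math.log10(max(est_rows, 1.0))))
    return (frozenset(indexes), bucket)


def group_partitions(
    parts: Iterable[Tuple[str, List[str], float]]
) -> Dict[GroupKey, List[str]]:
    """Group (name, indexes, est_rows) tuples by the combined key."""
    groups: Dict[GroupKey, List[str]] = defaultdict(list)
    for name, indexes, est_rows in parts:
        groups[group_key(indexes, est_rows)].append(name)
    return dict(groups)
```

With such a key, two partitions with the same indexes but very different sizes land in different groups, so each group gets its own full path build.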

--
Best Regards
Ivan Bykov
