From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
Cc: | Nikita Malakhov <hukutoc(at)gmail(dot)com>, Andy Fan <zhihuifan1213(at)163(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Considering fractional paths in Append node |
Date: | 2025-03-05 11:20:04 |
Message-ID: | CAPpHfdsyOv_t=VVZds-8mkEuwvx1326RGWoOzS9rgnA8uL5Y+w@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Mar 5, 2025 at 8:32 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> On 5/3/2025 03:27, Alexander Korotkov wrote:
> > On Mon, Mar 3, 2025 at 1:04 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> >>> 2. Since the use of root->tuple_fraction for a RelOptInfo has been criticized,
> >>> do you think we could limit this to some simple cases? For instance,
> >>> check that the RelOptInfo is the final result relation for the given root.
> >> I believe that using tuple_fraction is not an issue. Instead, it serves
> >> as a flag that allows the upper-level optimisation to consider
> >> additional options. The upper-level optimiser has more variants to
> >> examine and will select the optimal path based on the knowledge
> >> available at that level. Therefore, we're not introducing a mistake
> >> here; we're simply adding extra work in the narrow case. However, having
> >> only the bottom-up planning process, I don't see how we could avoid this
> >> additional workload.
> >
> > Yes, but if we can assume root->tuple_fraction applies to the result of
> > Append, it's strange that we apply the same tuple fraction to all the child
> > rels. Later rels are less likely to be used at all and perhaps
> > should get a smaller tuple_fraction.
> Of course, it may happen. But I'm not sure it is a common rule.
> Using LIMIT, we usually select data according to specific clauses.
> Imagine, we need TOP-100 ranked goods. Appending partitions of goods, we
> will use the index on the 'rating' column. But who knows how top-rated
> goods are spread across partitions? Maybe a single partition contains
> all of them? So, we need to select 100 top-rated goods from each partition.
> Hence, applying the same limit to each partition seems reasonable, right?
OK, I hadn't noticed that add_paths_to_append_rel() is used for MergeAppend
as well. I thought again about regular Append. If we can get the required
number of rows from the first few child relations, an error in the
tuple fraction shouldn't influence the plans much, and the other child
relations wouldn't be used at all. But if we can't, we are unlikely to get a
selectivity prediction accurate enough to tell which exact
child relations are going to be used. So using the root tuple
fraction for every child relation would be safe, and this approach
makes sense to me.
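
To make the reasoning concrete, here is a toy sketch of picking a path for each Append child using the same root-level fraction. It is not the patch and not planner code: the Path class, the helper names, and all cost numbers are made up for illustration. The only thing borrowed from the planner is the usual linear interpolation between startup and total cost, and the sketch assumes the fraction is in (0, 1], ignoring the case where tuple_fraction carries an absolute row count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical stand-in for a planner path; not a PostgreSQL structure."""
    name: str
    startup_cost: float
    total_cost: float


def fractional_cost(path: Path, fraction: float) -> float:
    # Cost of fetching `fraction` of the rows, interpolated linearly
    # between startup cost and total cost.
    return path.startup_cost + fraction * (path.total_cost - path.startup_cost)


def cheapest_fractional_path(paths: list[Path], fraction: float) -> Path:
    # Pick the path that is cheapest for the given fraction of its output.
    return min(paths, key=lambda p: fractional_cost(p, fraction))


def append_child_choices(children: dict[str, list[Path]], fraction: float) -> dict[str, str]:
    # Apply the SAME root-level fraction to every child, as discussed above.
    return {
        rel: cheapest_fractional_path(paths, fraction).name
        for rel, paths in children.items()
    }


# Made-up costs: a sort-based path with a high startup cost versus an
# index scan that is cheap to start but expensive to run to completion.
children = {
    "part_1": [Path("seq_scan_sort", 1000.0, 1100.0), Path("index_scan", 0.5, 5000.0)],
    "part_2": [Path("seq_scan_sort", 800.0, 900.0), Path("index_scan", 0.5, 4000.0)],
}

small_fraction = append_child_choices(children, 0.01)
full_scan = append_child_choices(children, 1.0)
```

With these numbers, a small fraction favours the fast-start index scan in every child, while fetching everything favours the sort-based path. If the later children are never reached, the path chosen for them costs nothing at run time, which is why reusing the root fraction for them looks harmless.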
------
Regards,
Alexander Korotkov
Supabase