From: | Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
---|---|
To: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
Cc: | Nikita Malakhov <hukutoc(at)gmail(dot)com>, Andy Fan <zhihuifan1213(at)163(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, David Rowley <dgrowleyml(at)gmail(dot)com> |
Subject: | Re: Considering fractional paths in Append node |
Date: | 2025-03-09 11:44:58 |
Message-ID: | CAPpHfdt2oOoarXM71TNL+Ud71DLG506ScpomfNu0LYciedT3ig@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Mar 5, 2025 at 1:20 PM Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> On Wed, Mar 5, 2025 at 8:32 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> > On 5/3/2025 03:27, Alexander Korotkov wrote:
> > > On Mon, Mar 3, 2025 at 1:04 PM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> > >>> 2. As the usage of root->tuple_fraction for a RelOptInfo has been
> > >>> criticized, do you think we could limit this to some simple cases? For
> > >>> instance, check that RelOptInfo is the final result relation for given root.
> > >> I believe that using tuple_fraction is not an issue. Instead, it serves
> > >> as a flag that allows the upper-level optimisation to consider
> > >> additional options. The upper-level optimiser has more variants to
> > >> examine and will select the optimal path based on the knowledge
> > >> available at that level. Therefore, we're not introducing a mistake
> > >> here; we're simply adding extra work in the narrow case. However, having
> > >> only the bottom-up planning process, I don't see how we could avoid this
> > >> additional workload.
> > >
> > > Yes, but if we can assume root->tuple_fraction applies to the result of
> > > Append, it's strange that we apply the same tuple fraction to all the
> > > child rels. Later rels are less likely to be used at all and perhaps
> > > should get a smaller tuple_fraction.
> > Of course, it may happen. But I'm not sure it is a common rule.
> > Using LIMIT, we usually select data according to specific clauses.
> > Imagine, we need TOP-100 ranked goods. Appending partitions of goods, we
> > will use the index on the 'rating' column. But who knows how top-rated
> > goods are spread across partitions? Maybe a single partition contains
> > all of them? So, we need to select 100 top-rated goods from each partition.
> > Hence, applying the same limit to each partition seems reasonable, right?
>
> Ok, I didn't notice add_paths_to_append_rel() is used for MergeAppend
> as well. I thought again about regular Append. If we can get the
> required number of rows from the first few child relations, the error
> in the tuple fraction shouldn't influence plans much, and the other
> child relations wouldn't be used at all. But if we can't, we are
> unlikely to get a selectivity estimate accurate enough to predict
> exactly which child relations are going to be used. So, using the root
> tuple fraction for every child relation would be safe. So, this
> approach makes sense to me.
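
To make the TOP-100 argument quoted above concrete, here is a minimal sketch in Python. It is not taken from the patch: `top_n_merge` and the sample ratings are hypothetical, and `heapq.merge` only stands in for what a MergeAppend over per-partition index scans does. It shows that the number of rows taken from each partition depends entirely on how the top-rated rows happen to be distributed.

```python
import heapq
from itertools import islice


def top_n_merge(partitions, n):
    """Merge partitions that are each sorted by rating (descending) and
    return the n best ratings plus the number of rows taken per partition.

    Hypothetical helper for illustration only.
    """
    # Tag every rating with the index of the partition it came from.
    streams = [
        [(rating, idx) for rating in rows]
        for idx, rows in enumerate(partitions)
    ]
    # reverse=True makes heapq.merge expect descending inputs.
    merged = heapq.merge(*streams, key=lambda t: t[0], reverse=True)
    top = list(islice(merged, n))

    taken = [0] * len(partitions)
    for _, idx in top:
        taken[idx] += 1
    return [rating for rating, _ in top], taken


# All four best rows live in the first partition.
partitions = [[99, 98, 97, 96], [50, 40, 30], [95, 10]]
print(top_n_merge(partitions, 4))
```

Since the planner cannot know in advance which partition holds the top rows, each child has to be ready to supply up to the full limit, which is the case for giving every child the same tuple fraction.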
I've slightly revised the commit message and comments. I'm going to
push this if there are no objections.
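
For archive readers, the general idea of choosing child paths by tuple fraction can be sketched as below. This is an illustration under stated assumptions, not the patch's code: the `Path` class, the helper names, and all cost numbers are hypothetical. The only part borrowed from the planner is the linear interpolation between startup and total cost that it uses when comparing fractional path costs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical stand-in for a planner path with two cost figures."""
    name: str
    startup_cost: float
    total_cost: float


def fractional_cost(path, fraction):
    """Estimated cost of fetching `fraction` (0..1] of the path's rows."""
    return path.startup_cost + fraction * (path.total_cost - path.startup_cost)


def cheapest_fractional(paths, fraction):
    """Pick the path that is cheapest for the given tuple fraction."""
    return min(paths, key=lambda p: fractional_cost(p, fraction))


def choose_append_children(children, fraction):
    """Apply the same tuple fraction to every child, as discussed above."""
    return [cheapest_fractional(paths, fraction).name for paths in children]


# Two children, each with a sort-based path and an index-based path.
child = [
    Path("seqscan+sort", startup_cost=1000.0, total_cost=1100.0),
    Path("indexscan", startup_cost=0.5, total_cost=5000.0),
]
print(choose_append_children([child, child], 0.01))
print(choose_append_children([child, child], 1.0))
```

With a small fraction the low-startup index path wins for each child, while fetching everything favours the path with the lower total cost.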
------
Regards,
Alexander Korotkov
Supabase
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Teach-Append-to-consider-tuple_fraction-when-accu.patch | application/x-patch | 11.8 KB |