From: | Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru> |
---|---|
To: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
Cc: | Andy Fan <zhihuifan1213(at)163(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, Nikita Malakhov <HukuToc(at)gmail(dot)com> |
Subject: | Re: MergeAppend could consider sorting cheapest child path |
Date: | 2025-04-25 09:16:29 |
Message-ID: | 96e6b45ca97ac61657ca8c4d65e135f8@postgrespro.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Andrei Lepikhov писал(а) 2025-04-24 16:01:
> On 3/28/25 09:19, Alexander Pyhalov wrote:
>> Andy Fan писал(а) 2024-10-17 03:33:
>> I've updated patch. One more interesting case which we found - when
>> fractional path is selected, it still can be more expensive than
>> sorted cheapest total path (as we look only on indexes whith necessary
>> pathkeys, not on indexes which allow efficiently fetch data).
>> So far couldn't find artificial example, but we've seen inadequate
>> index selection due to this issue - instead of using index suited for
>> filters in where, index, suitable for sorting was selected as one
>> having the cheapest fractional cost.
> I think it is necessary to generalise the approach a little.
>
> Each MergeAppend subpath candidate that fits pathkeys should be
> compared to the overall-optimal path + explicit Sort node. Let's label
> this two-node composition as base_path. There are three cases exist:
> startup-optimal, total-optimal and fractional-optimal.
> In the startup case, base_path should use cheapest_startup_path, the
> total-optimal case - cheapest_total_path and for a fractional case, we
> may employ the get_cheapest_fractional_path routine to detect proper
> base_path.
>
> It may provide a more effective plan either in full, fractional and
> partial scan cases:
> 1. The Sort node may be pushed down to subpaths under a parallel or
> async Append.
> 2. When a minor set of subpaths doesn't have a proper index, and it is
> profitable to sort them instead of switching to plain Append.
>
> In general, analysing the regression tests changed by this code, I see
> that the cost model prefers a series of small sortings instead of a
> single massive one. May be it will show some profit for execution time.
>
> I am not afraid of any palpable planning overhead here because we just
> do cheap cost estimation and comparison operations that don't need
> additional memory allocations. The caller is responsible for building a
> proper Sort node if this method is chosen as optimal.
>
> In the attachment, see the patch written according to the idea. There
> are I introduced two new routines:
> get_cheapest_path_for_pathkeys_ext
> get_cheapest_fractional_path_for_pathkeys_ext
Hi. I'm a bit confused that
get_cheapest_fractional_path_for_pathkeys_ext() looks only on sorting
cheapest fractional path, and get_cheapest_path_for_pathkeys_ext() in
STARTUP_COST case looks only on sorting cheapest_startup_path.
Usually, sorted cheapest_total_path will be cheaper than sorted
fractional/startup path at least by startup cost (as after sorting it
includes total_cost of input path). But we ignore this case when
selecting cheapest_startup and cheapest_fractional subpaths. As result
selected cheapest_startup and cheapest_fractional can be not cheapest
for startup or selecting a fraction of rows.
Consider the partition with the following access paths:
1) cheapest_startup without required pathkeys:
startup_cost: 0.42
total_cost: 4004
2) some index_path with required pathkeys:
startup_cost: 6.6
total_cost: 2000
3) cheapest_total_path:
startup_cost: 0.42
total_cost: 3.48
Here, when selecting cheapest startup subpath we'll compare costs of
index path (2) and sorted cheapest_startup (1), but will ignore sorted
cheapest_total_path (3), despite the fact that it really can be the
cheapest startup path, providing required sorting order.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
From | Date | Subject | |
---|---|---|---|
Next Message | George MacKerron | 2025-04-25 10:20:55 | Re: Making sslrootcert=system work on Windows psql |
Previous Message | Amit Kapila | 2025-04-25 09:08:51 | Re: long-standing data loss bug in initial sync of logical replication |