From: | James Coleman <jtc331(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Andreas Karlsson <andreas(at)proxel(dot)se> |
Subject: | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Date: | 2020-04-02 02:09:20 |
Message-ID: | CAAaqYe8fWa8pBUStzW-4JocF+d3pFBVLFmb9rtcD0m12NskBJw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Apr 1, 2020 at 5:42 PM Tomas Vondra
<tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> ...
> I've realized the way get_useful_pathkeys_for_relation() is coded kinda
> works against the fastpath we added when comparing pathkeys. That
> depends on comparing pointers to the list, but we've been building new
> lists (and then returned those) which defeats the optimization. Attached
> is a patch that returns the original list in most cases (and only
> creates a copy when really necessary). This might also save a few cycles
> on bulding the new list, of course.
>
> I've done a bunch of read-only pgbench tests with fairly small scales (1
> and 10). First with the built-in read-only transaction, and also with a
> simple custom query doing an order-by. And I did this both on the
> default schema and with a bunch of extra indexes. The script I used to
> run this is attached, along with a summary of results.
>
> There are results for master and v40 and v50 patches (the v50 also
> includes the extra patch fixing get_useful_pathkeys_for_relation).
>
> Overall, I'm happy with those results - the v50 seems to be within 1% of
> master, in both directions. This very much seems like a noise.
>
> I still want to do a bit more review of the costing / tuplesort changes,
> which I plan to do tomorrow. If that goes well, I plan to start
> committing this. So please if you think this is not ready or wants more
I think we need to either implement this or remove the comment:
* XXX I wonder if we need to consider adding a projection here, as
* create_ordered_paths does.
in generate_useful_gather_paths().
In the same function we have the following code:
/*
* When the partial path is already sorted, we can just add a gather
* merge on top, and we're done - no point in adding explicit sort.
*
* XXX Can't we skip this (maybe only for the cheapest partial path)
* when the path is already sorted? Then it's likely duplicate with
* the path created by generate_gather_paths.
*/
if (is_sorted)
{
path = create_gather_merge_path(root, rel, subpath, rel->reltarget,
subpath->pathkeys, NULL, rowsp);
add_path(rel, &path->path);
continue;
}
looking at the relevant loop in generate_gather_paths:
/*
* For each useful ordering, we can consider an order-preserving Gather
* Merge.
*/
foreach(lc, rel->partial_pathlist)
{
Path *subpath = (Path *) lfirst(lc);
GatherMergePath *path;
if (subpath->pathkeys == NIL)
continue;
rows = subpath->rows * subpath->parallel_workers;
path = create_gather_merge_path(root, rel, subpath, rel->reltarget,
subpath->pathkeys, NULL, rowsp);
add_path(rel, &path->path);
}
I believe we can eliminate the block entirely in
generate_useful_gather_paths(). Here's my reasoning: all paths for
which is_sorted is true must necessarily have pathkeys, and since we
already add a gather merge for every subpath with pathkeys, we've
already added gather merge paths for all of these.
I've included a patch to change this, but let me know if the reasoning
isn't sound.
We can also remove the XXX on this comment (in the same function):
* XXX This is not redundant with the gather merge path created in
* generate_gather_paths, because that merely preserves ordering of
* the cheapest partial path, while here we add an explicit sort to
* get match the useful ordering.
because of this code in generate_gather_paths():
cheapest_partial_path = linitial(rel->partial_pathlist);
rows =
cheapest_partial_path->rows * cheapest_partial_path->parallel_workers;
simple_gather_path = (Path *)
create_gather_path(root, rel, cheapest_partial_path, rel->reltarget,
NULL, rowsp);
add_path(rel, simple_gather_path);
but we can cleanup the comment a bit: fix the grammar issue in the
last line and fix the reference to gather merge path (it's a gather
path).
I've included that in the same patch.
I also noticed that in create_incremental_sort_path we have this:
/* XXX comparison_cost shouldn't be 0? */
but I guess that's part of what you're reviewing tomorrow.
> time for a review, let me know. I'm not yet sure if I'll commit this as
> a single change, or in three separate commits.
I don't love the idea of committing it as a single patch, but at least
the first two I think probably go together. Otherwise we're
introducing a "fix" with no proven impact that will slow down planning
(even if only in a small way) only to intend to condition that on a
GUC in the next commit.
But I think you could potentially make an argument for keeping the
additional paths separate...but it's not absolutely necessary IMO.
> James, can you review the proposed extra fix and merge the fixes into
> the main patches?
I've reviewed it, and it looks correct, so merged into the main series.
Summary:
The attached series includes a couple of XXX fixes or comment cleanup
as noted above. I believe there are two more XXXs that needs to be
answered before we merge ("do we need to consider adding a projection"
and "what is the comparison cost for incremental sort").
James
Attachment | Content-Type | Size |
---|---|---|
v51-0001-Consider-low-startup-cost-when-adding-partial-pa.patch | text/x-patch | 4.6 KB |
v51-0003-Consider-incremental-sort-paths-in-additional-pl.patch | text/x-patch | 26.2 KB |
v51-0004-cleanup-xxx-in-generate_useful_gather_paths.patch | text/x-patch | 2.3 KB |
v51-0002-Implement-incremental-sort.patch | text/x-patch | 171.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2020-04-02 02:29:04 | Re: WIP: BRIN multi-range indexes |
Previous Message | Shinoda, Noriyoshi (PN Japan A&PS Delivery) | 2020-04-02 02:04:10 | RE: SLRU statistics |