Re: Incremental sort for access method with ordered scan support (amcanorderbyop)

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: Richard Guo <guofenglinux(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Miroslav Bendik <miroslav(dot)bendik(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Incremental sort for access method with ordered scan support (amcanorderbyop)
Date: 2023-07-13 13:20:33
Message-ID: 3122b1ba-b5e5-ad92-0894-b7a8c639f1ff@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 7/5/23 2:15 AM, Richard Guo wrote:
>
> On Tue, Jul 4, 2023 at 7:15 PM David Rowley <dgrowleyml(at)gmail(dot)com
> <mailto:dgrowleyml(at)gmail(dot)com>> wrote:
>
> On Tue, 4 Jul 2023 at 20:12, Richard Guo <guofenglinux(at)gmail(dot)com
> <mailto:guofenglinux(at)gmail(dot)com>> wrote:
> > The v4 patch looks good to me (maybe some cosmetic tweaks are still
> > needed for the comments).  I think it's now 'Ready for Committer'.
>
> I agree. I went and hit the comments with a large hammer and while
> there also adjusted the regression tests. I didn't think having "t" as
> a table name was a good idea as it seems like a name with a high risk
> of conflicting with a concurrently running test. Also, there didn't
> seem to be much need to insert data into that table as the tests
> didn't query any of it.
>
> The only other small tweak I made was to not call list_copy_head()
> when the list does not need to be shortened. It's likely not that
> important, but if the majority of cases are not partial matches, then
> we'd otherwise be needlessly making copies of the list.
>
> I pushed the adjusted patch.
>
>
> The adjustments improve the patch a lot.  Thanks for adjusting and
> pushing the patch.

Thanks for working on this! While it allows the planner to consider
choosing an incremental sort for indexes that implement
"amcanorderbyop", it also has a positive side-effect that the planner
will also consider choosing a plan for spawning parallel workers!

Because of that, I'd like to open the discussion that we consider
backpatching this. Currently, extensions that implement index access
methods (e.g. pgvector[1]) that are built primarily around
"amcanorderbyop" are unable to get the planner to consider choosing a
parallel scan, i.e. at this point in "create_order_paths"[2]:

/*
* If cheapest partial path doesn't need a sort, this is redundant
* with what's already been tried.
*/
if (!pathkeys_contained_in(root->sort_pathkeys,
cheapest_partial_path->pathkeys))

However, 625d5b3c does unlock this path for these types of indexes to
allow for a parallel index scan to be chosen, which would allow
extensions that implement a "amcanorderbyop" scan to use it. I would
argue that this is a bug, given we offer the ability for index access
methods to implement parallel index scans.

That said, I do think they may still need to be one planner tweak to
properly support parallel index scan in this case, as I have yet to see
costs generated where the parallel index scan is cheaper. However, I
have not yet narrowed what/where that is.

Thanks,

Jonathan

[1] https://github.com/pgvector/pgvector
[2]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/optimizer/plan/planner.c;#l5188

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Dave Cramer 2023-07-13 13:53:55 Re: CommandStatus from insert returning when using a portal.
Previous Message Amit Kapila 2023-07-13 13:06:49 Re: doc: clarify the limitation for logical replication when REPILICA IDENTITY is FULL