Re: "could not find pathkey item to sort" for TPC-DS queries 94-96

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Luc Vlaming <luc(at)swarm64(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: "could not find pathkey item to sort" for TPC-DS queries 94-96
Date: 2021-04-17 19:39:45
Message-ID: 355670.1618688385@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

[ sorry for not getting to this thread till now ]

Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
> 3) Shouldn't find_em_expr_usable_for_sorting_rel now mostly mimic what
> prepare_sort_from_pathkeys does? That is, try to match the entries
> directly first, before the new pull_vars() business?

Yeah. I concur that the problem here is that
find_em_expr_usable_for_sorting_rel isn't fully accounting for what
prepare_sort_from_pathkeys can and can't do. However, I don't like this
patch much:

* As written, I think it may just move the pain somewhere else. The point
of the logic in prepare_sort_from_pathkeys is to handle either full
expression matches (e.g. sort by "A+B" when "A+B" is an expression in
the input tlist) or computable expressions (sort by "A+B" when A and B
are individually available). I think you've fixed the second case and
broken the first one. Now it's possible that the case never arises,
and certainly failing to generate an early sort isn't catastrophic anyway.
But we ought to get it right.

* If the goal is to match what prepare_sort_from_pathkeys can do, I
think that doubling down on the strategy of having a duplicate copy
is not the path to a maintainable fix.

I think it's time for some refactoring of this code so that we can
actually share the logic. Accordingly, I propose the attached.
It's really not that hard to share, as long as you accept the idea
that the list passed to the shared subroutine can be either a list of
TargetEntries or of bare expressions.

Also, I don't much care for either the name or API of
find_em_expr_usable_for_sorting_rel. The sole current caller only
really needs a boolean result, and if it did need more than that
it'd likely need the whole EquivalenceMember not just the em_expr
(certainly createplan.c does). So 0002 attached is some bikeshedding
on that. I kept that separate because it might be wise to do it only
in HEAD, just in case somebody out there is calling the function from
an extension.

(BTW, responding to an upthread question: I think the looping to
remove multiple levels of RelabelType is probably now redundant,
but I didn't remove it. If we want to do that there are more
places to touch than just this code.)

regards, tom lane

Attachment Content-Type Size
0001-Fix-find_em_expr_usable_for_sorting_rel-2.patch text/x-diff 19.6 KB
0002-change-function-name.patch text/x-diff 5.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Mark Dilger 2021-04-17 19:43:21 Re: pg_amcheck option to install extension
Previous Message David G. Johnston 2021-04-17 19:24:57 Re: feature request ctid cast / sql exception