Re: Avoid computing ORDER BY junk columns unnecessarily

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
Cc: Xiaoran Wang <fanfuxiaoran(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Andrew Kane <andrew(at)ankane(dot)org>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Subject: Re: Avoid computing ORDER BY junk columns unnecessarily
Date: 2023-12-28 23:42:57
Message-ID: 2935085.1703806977@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> Yeah, fair point. I'll try to take a look at your patchset after
> the holidays.

I spent some time looking at this patch, and I'm not very pleased
with it. My basic complaint is that this is a band-aid that only
touches things at a surface level, whereas I think we need a much
deeper set of changes in order to have a plausible solution.
Specifically, you're only munging the final top-level targetlist
not anything for lower plan levels. That looks like it works in
simple cases, but it leaves everything on the table as soon as
there's more than one level of plan involved. I didn't stop to
trace the details but I'm pretty sure this is why you're getting the
bogus extra projections shown in the 0005 patch. Moreover, this
isn't doing anything for cost estimates, which means the planner
doesn't really understand that more-desirable plans are more
desirable, and it may not pick an available plan that would exploit
what we want to have happen here.

As an example, say we have an index on f(x) and the query requires
sorting by f(x) but not final output of f(x). If we devise a plan
that uses the index to sort and doesn't emit f(x), we need to not
charge the evaluation cost of f() for that plan --- this might
make the difference between picking that plan and not. Right now
I believe we'll charge all plans for f(), so that some other plan
might look cheaper even when f() is very expensive.

Another example: we might be using an indexscan but not relying on
its sort order, for example because its output is going into a hash
join and then we'll sort afterwards. For expensive f() it would
still be useful if the index could emit f(x) so that we don't have
to calculate f() at the sort step. Right now I don't think we can
even emit that plan shape, never mind recognize why it's cheaper.

I have only vague ideas about how to do this better. It might work
to set up multiple PathTargets for a relation that has such an
index, one for the base case where the scan only emits x and f() is
computed above, one for the case where we don't need either x or
f(x) because we're relying on the index order, and one that emits
f(x) with the expectation that a sort will happen above. Then we'd
potentially generate multiple Paths representing the very same
indexscan but with different PathTargets, and differing targets
would have to become something that add_path treats as a reason to
keep multiple Paths for the same relation. I'm a little frightened
about the possible growth in number of paths considered, but how
else would we keep track of the differing costs of these cases?

So your proposed patch is far from a final solution in this area,
and I'm not even sure that it represents a plausible incremental
step. I've got mixed feelings about the "make resjunk an enum"
idea --- it seems quite invasive and I'm not sure it's buying
enough to risk breaking non-planner code for. It might be
better to leave the targetlist representation alone and deal
with this strictly inside the planner. We could, for example,
add an array to PlannerInfo that's indexed by query-tlist resno
and indicates the reason for a particular tlist item to exist.

Also, as you mentioned, this categorization of junk columns
seems a little unprincipled. It might help to think about that
in terms similar to the "attr_needed" data we keep for Vars,
that is: how far up in the plan tree is this tlist item needed?
I think the possibilities are:

* needed for final output (ie, not resjunk)

* needed for final grouping/sorting steps (I think all
resjunk items produced by the parser are this case;
but do we need to distinguish GROUP BY from ORDER BY?)

* needed at the ModifyTable node (for rowmark columns)

* needed at the SetOp node (for flag columns added in
prepunion.c)

It looks to me like these cases are mutually exclusive, so
that we just need an enum value not a bitmask. Only
"needed for final grouping/sorting" is something we can
potentially omit from the plan.

BTW, the hack you invented JUNK_PLANNER_ONLY for, namely
that create_indexscan_plan feeds canreturn flags forward to
set_indexonlyscan_references, could be done another way:
we don't really have to overload resjunk for that. At worst,
set_indexonlyscan_references could re-look-up the IndexOptInfo
using the index OID from the plan node, and get the canreturn
flags directly from that. If we keep resjunk as a bool I'd be
inclined to leave this alone; but if we change resjunk, we
don't need the replacement design to account for this bit.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-12-28 23:55:08 Re: Statistics Import and Export
Previous Message Bruce Momjian 2023-12-28 22:40:31 Re: The segmentation fault of Postgresql 9.6.24