Re: Possible improvement

From: Paul van der Linden <paul(dot)doskabouter(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Possible improvement
Date: 2020-06-05 18:39:17
Message-ID: CAEC-EqDuDsS-vvifDDsvco_CGT9W75dMx03ideV=BnwWaAaHmA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks for your thoughts.

For the case where it isn't known if the case expression itself is indexed,
technically that should be added as a decision-node in the query planner.
After all there are 2 possibilities to handle that so it should be up to
the planner to choose the cheapest.

Having said that, if the time spent planning the query is *that* critical I
agree that it probably isn't worth it.
Just that in my line of work the execution time of a query is a lot of
orders of magnitude larger than the planning time (my recordholder is a
query that runs for just over 3 days...)

On Fri, Jun 5, 2020 at 4:31 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> David Rowley <dgrowleyml(at)gmail(dot)com> writes:
> > On Fri, 5 Jun 2020 at 14:41, Paul van der Linden
> > <paul(dot)doskabouter(at)gmail(dot)com> wrote:
> >> If I have a query like:
> >>
> >> SELECT * FROM (
> >> SELECT
> >> CASE
> >> WHEN field='value1' THEN 1
> >> WHEN field='value2' THEN 2
> >> END AS category
> >> FROM table1
> >> ) AS foo
> >> WHERE category=1
> >>
> >> doesn't use the index on field, while technically it could do that.
> >> Is it hard to implement drilling down the constant in the WHERE to
> within the CASE?
>
> > It doesn't look impossible to improve that particular case. See
> > eval_const_expressions_mutator() in clauses.c at T_CaseExpr. However,
> > this would need to take constant folding further than we take it
> > today. Today we just have the ability to simplify expressions which
> > are, by themselves, an expression which will always evaluate to a
> > constant value. This case is more complex as it requires something
> > outside the CASE expr to allow the simplification to take place. In
> > this case, we'd need to look at the other side of the OpExpr to see
> > the const there before any transformation could simplify it.
>
> I'd tend to see this as a transformation rule that acts on equality-
> with-a-CASE-input, thereby avoiding the "action at a distance" problem.
>
> > It's
> > also not entirely clear that the simplification would always be a good
> > idea. What, for example if there was an index on the case statement
> > but none on "field". The query may perform worse!
>
> FWIW, I'm not too fussed about that objection. If we rejected new
> optimizations on the basis that somebody's optimized-for-the-old-way
> query might perform worse, almost no planner changes would ever get in.
> I think most people would feel that an optimization like this is an
> improvement. (I recall coming across a similar case in an
> information_schema query just a few days ago.) The hard questions
> I would ask are
> 1. Is the transformation actually correct?
> 2. Does it improve queries often enough to be worth the planning cycles
> expended to look for the optimization?
>
> As far as #1 goes, note that this CASE produces NULL if "field" is
> neither 'value1' nor 'value2', whereupon the equality operator would
> also produce NULL, so that simplifying to "field='value1'" is not
> formally correct: that would produce FALSE not NULL for other values
> of "field". We can get away with the replacement anyway at the top
> level of WHERE, but not in other contexts. Hence, it'd be wrong to
> try to make this transformation in eval_const_expressions(), which is
> applied to all expressions. Possibly prepqual.c's canonicalize_qual()
> would be a better place.
>
> The real problem here is going to be objection #2. The rules under
> which any optimization could be applied are nontrivial, so that we'd
> spend quite a bit of time trying to figure out whether the optimization
> applies ... and I'm afraid that most of the time it would not.
>
> regards, tom lane
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Tom Lane 2020-06-05 19:02:34 Re: Possible improvement
Previous Message Tom Lane 2020-06-05 18:35:26 Re: Postgres12 - Confusion with pg_restore