Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

From: James Coleman <jtc331(at)gmail(dot)com>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date: 2020-04-28 18:24:54
Message-ID: CAAaqYe-rj-tfJzNU8C+C6xoNomVCtvg_oi0St2djh4=Xvzm8OQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 28, 2020 at 1:40 PM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>
>
>
> út 28. 4. 2020 v 18:17 odesílatel James Coleman <jtc331(at)gmail(dot)com> napsal:
>>
>> On Tue, Apr 28, 2020 at 11:18 AM Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> wrote:
>> >
>> >
>> >
>> > út 28. 4. 2020 v 16:48 odesílatel Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> napsal:
>> >>
>> >> On Tue, Apr 28, 2020 at 03:43:43PM +0200, Pavel Stehule wrote:
>> >> >út 28. 4. 2020 v 15:26 odesílatel Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
>> >> >napsal:
>> >> >
>> >> >> ...
>> >> >>
>> >> >> >I'm not so concerned about this in any query where we have a real FROM
>> >> >> >clause because even if we end up with only one row, the relative
>> >> >> >penalty is low, and the potential gain is very high. But simple
>> >> >> >expressions in pl/pgsql, for example, are a case where we can know for
>> >> >> >certain (correct me if I've wrong on this) that we'll only execute the
>> >> >> >expression once, which means there's probably always a penalty for
>> >> >> >choosing the implementation with setup costs over the default linear
>> >> >> >scan through the array.
>> >> >> >
>> >> >>
>> >> >> What do you mean by "simple expressions"? I'm not plpgsql expert and I
>> >> >> see it mostly as a way to glue together SQL queries, but yeah - if we
>> >> >> know a given ScalarArrayOpExpr will only be executed once, then we can
>> >> >> disable this optimization for now.
>> >> >>
>> >> >
>> >> >a := a + 1
>> >> >
>> >> >is translated to
>> >> >
>> >> >SELECT $1 + 1 and save result to var a
>> >> >
>> >> >The queries like this "SELECT $1 + 1" are simple expressions. They are
>> >> >evaluated just on executor level - it skip SPI
>> >> >
>> >> >the simple expression has not FROM clause, and have to return just one row.
>> >> >I am not sure if it is required, it has to return just one column.
>>
>> Yes, this is what I meant by simple expressions.
>>
>> >> >I am not sure if executor knows so expression is executed as simply
>> >> >expressions. But probably it can be deduced from context
>> >> >
>> >>
>> >> Not sure. The executor state is created by exec_eval_simple_expr, which
>> >> calls ExecInitExprWithParams (and it's the only caller). And that in
>> >> turn is the only place that leaves (state->parent == NULL). So maybe
>> >> that's a way to identify simple (standalone) expressions? Otherwise we
>> >> might add a new EEO_FLAG_* to identify these expressions explicitly.
>>
>> I'll look into doing one of these.
>>
>> >> I wonder if it would be possible to identify cases when the expression
>> >> is executed in a loop, e.g. like this:
>> >>
>> >> FOR i IN 1..1000 LOOP
>> >> x := y IN (1, 2, ..., 999);
>> >> END LOOP;
>> >>
>> >> in which case we only build the ScalarArrayOpExpr once, so maybe we
>> >> could keep the hash table for all executions. But maybe that's not
>> >> possible or maybe it's pointless for other reasons. It sure looks a bit
>> >> like trying to build a query engine from FOR loop.
>> >
>> >
>> > Theoretically it is possible, not now. But I don't think so it is necessary. I cannot to remember this pattern in any plpgsql code and I never seen any request on this feature.
>> >
>> > I don't think so this is task for plpgsql engine. Maybe for JIT sometimes.
>>
>> Agreed. I'd thought about this kind of scenario when I brought this
>> up, but I think solving it would the responsibility of the pg/pgsql
>> compiler rather than the expression evaluation code, because it'd have
>> to recognize the situation and setup a shared expression evaluation
>> context to be reused each time through the loop.
>
>
> can be nice if new implementation was not slower then older in all environments and context (including plpgsql expressions)

Agreed, which is why I'm going to look into preventing using the new
code path for simple expressions.

James

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Jonah H. Harris 2020-04-28 19:38:09 Re: Proposing WITH ITERATIVE
Previous Message Juan José Santamaría Flecha 2020-04-28 18:14:55 Re: PG compilation error with Visual Studio 2015/2017/2019