Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: James Coleman <jtc331(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date: 2021-04-08 06:50:29
Message-ID: CAApHDvrw21K-Os3FKobrNvvJ0yp+eq6N7OET2QnwxGS=LvOyzg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 8 Apr 2021 at 05:54, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> I only ran that with a single client, the machine only has 4 cores and
> this should not be related to concurrency, so 1 client seems fine. The
> average of 10 runs, 15 seconds each look like this:

Thanks for running these tests. The reason I added so much
concurrency was that this AMD machine has some weird behaviour in
regards to power management. Every bios update I seem to get changes
the power management but it still is very unstable despite me having
it on the most optimal settings in the bios. Working it a bit harder
seems to help make it realise that there might be some urgency.

> simple prepared 10/s 10/p 100/s 100/p
> -------------------------------------------------------------
> master 21847 59476 23343 59380 11757 56488
> v4 21546 57757 22864 57704 11572 57350
> v4+v5 23374 56089 24410 56140 14765 55302
>
> The first two columns are your bench.sql, with -M simple or prepared.
> The other columns are 10 or 100 values, /s is simple, /p is prepared.
>
> Compared to master:
>
> simple prepared 10/s 10/p 100/s 100/p
> -------------------------------------------------------------
> v4 98.62% 97.11% 97.95% 97.18% 98.43% 101.52%
> v4+v5 106.99% 94.31% 104.57% 94.54% 125.59% 97.90%
>
> That seems to mostly match your observation - there's a small
> performance hit (~2%), although that might be due to changes in the
> layout of the binary. And v4+v5 improves that a bit (even compared to
> master), although I don't see the same 20% speedup.

I've spent more time hacking at this patch. I had a bit of a change
of heart earlier about having this new HashedScalarArrayOpExpr node
type. There were more places that I imagined that I needed to add
handling for it. For example, partprune.c needed to know about it to
allow partition pruning on them. While supporting that is just a few
lines to make a recursive call passing in the underlying
ScalarArrayOpExpr, I just didn't like the idea.

Instead, I think it'll be better just to add a new field to
ScalarArrayOpExpr and have the planner set that to tell the executor
that it should use a hash table to perform the lookups rather than a
linear search. This can just be the hash function oid, which also
saves the executor from having to look that up.

After quite a bit of hacking, I've ended up with the attached. I
added the required JIT code to teach the jit code about
EEOP_HASHED_SCALARARRAYOP.

I also wrote the missing regression test for non-strict equality ops
and moved the declaration for the simplehash.h code into
execExprInterp.c and forward declared ScalarArrayOpExprHashTable in
exprExpr.h. I also rewrote a large number of comments and fixed a few
things like missing permission checks for the hash function.

I've not done any further performance tests yet but will start those now.

David

Attachment Content-Type Size
saop_hash_v6.patch application/octet-stream 33.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-04-08 06:58:02 Re: SQL-standard function body
Previous Message Kohei KaiGai 2021-04-08 06:48:15 Re: TRUNCATE on foreign table