From: | Peter Geoghegan <pg(at)bowt(dot)ie> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, jian he <jian(dot)universality(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, teodor(at)sigaev(dot)ru, Andrei Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
Subject: | Re: POC, WIP: OR-clause support for indexes |
Date: | 2024-06-24 16:08:59 |
Message-ID: | CAH2-WzmuJRV59y2LZGrjnNRgzpv57jbfJCdZqEAeDb5Q4fbp9A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jun 24, 2024 at 11:28 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> > It needs to transform all similar constants to one type, because some constants of "OR" expressions can belong others, like the numeric and int types. Due to the fact that array structure demands that all types must be belonged to one type, so for this reason we applied this procedure.
>
> The alternative that should be considered is not combining things if
> the types don't match. If we're going to combine such things, we need
> to be absolutely certain that type conversion cannot fail.
But what about cases like this:
SELECT * FROM mytable WHERE columna = 1_000_000_000 or columna =
5_000_000_000; -- columna is int4
This is using two types, of course. 1_000_000_000 is int4, while
5_000_000_000 is bigint. If the transformation suddenly failed to work
when a constant above INT_MAX was used for the first time, then I'd
say that that's pretty surprising. That's what happens currently if
you write the same query as "WHERE columna =
any('{1_000_000_000,5_000_000_000}')", due to the way the coercion
works. That seems less surprising to me, because the user is required
to construct their own array, and users expect arrays to always have
one element type.
It would probably be okay to make the optimization not combine
things/not apply when the user gratuitously mixes different syntaxes.
For example, if a numeric constant was used, rather than an integer
constant.
Maybe it would be practical to do something with the B-Tree operator
class for each of the types involved in the optimization. You could
probably find a way for a SAOP to work against a
"heterogeneously-typed array" while still getting B-Tree index scans
-- provided the types all came from the same operator family. I'm
assuming that the index has an index column whose input opclass was a
member of that same family. That would necessitate changing the
general definition of SAOP, and adding new code to nbtree that worked
with that. But that seems doable.
I was already thinking about doing something like this, to support
index scans for "IS NOT DISTINCT FROM", or on constructs like "columna
= 5 OR columna IS NULL". That is more or less a SAOP with two values,
except that one of the values in the value NULL. I've already
implemented "nbtree SAOPs where one of the elements is a NULL" for
skip scan, which could be generalized to support these other cases.
Admittedly I'm glossing over a lot of important details here. Does it
just work for the default opclass for the type, or can we expect it to
work with a non-default opclass when that's the salient opclass (the
one used by our index)? I don't know what you'd do about stuff like
that.
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Nathan Bossart | 2024-06-24 16:11:42 | Re: RFC: Additional Directory for Extensions |
Previous Message | David G. Johnston | 2024-06-24 16:02:46 | Re: [PATCH] Add ACL (Access Control List) acronym |