From: | Peter Geoghegan <pg(at)heroku(dot)com> |
---|---|
To: | "Jonathan S(dot) Katz" <jonathan(dot)katz(at)excoventures(dot)com> |
Cc: | Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: IS NOT DISTINCT FROM + Indexing |
Date: | 2014-07-21 23:51:32 |
Message-ID: | CAM3SWZTSuzm4V5hJ-P9mGdTk5SS0XjEjCgTJMCP42tBuZhR3gQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Mon, Jul 21, 2014 at 4:16 PM, Jonathan S. Katz
<jonathan(dot)katz(at)excoventures(dot)com> wrote:
> With NULLs being indexable, I was wondering if there was some reason why IS NOT DISTINCT FROM could not use the index?
FWIW this works:
postgres=# explain analyze select * from orders where orderid in (5, null);
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Index Scan using orders_pkey on orders (cost=0.29..12.60 rows=1
width=60) (actual time=0.019..0.021 rows=1 loops=1)
Index Cond: (orderid = ANY ('{5,NULL}'::integer[]))
Planning time: 0.100 ms
Execution time: 0.416 ms
(4 rows)
I think that it would almost be a Simple Matter of Programming to make
IS NOT DISTINCT FROM indexable. Under the hood, IS DISTINCT FROM isn't
very different to using the equality operator:
/*
* DistinctExpr - expression node for "x IS DISTINCT FROM y"
*
* Except for the nodetag, this is represented identically to an OpExpr
* referencing the "=" operator for x and y.
* We use "=", not the more obvious "<>", because more datatypes have "="
* than "<>". This means the executor must invert the operator result.
* Note that the operator function won't be called at all if either input
* is NULL, since then the result can be determined directly.
*/
typedef OpExpr DistinctExpr;
We're already inverting the equals operator. But that isn't
necessarily how a B-Tree index represents equality (that is, a
particular B-Tree operator class could have a non-'=' operator that it
thinks of as equality-ish - in general that could even be the default
B-Tree opclass and there may not be an equals operator). The fact that
most types think of the '=' equals operator as equality is just a
convention, and so technically IS DISTINCT FROM doesn't invert B-Tree
operation 3. See "31.14. Interfacing Extensions To Indexes" for
details. The equals operator '=' isn't really supposed to be magic, it
just is in some places.
Right now the executor is directly inverting the equality operator to
make this work (and has done so since long before NULLs were
indexable). This is a bit of a kludge. I guess it just works that way
because there is no convenient place to insert the special inversion
of the operator, and the special NULL handling that currently appears
within ExecEvalDistinct().
--
Peter Geoghegan
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2014-07-21 23:57:15 | Re: IS NOT DISTINCT FROM + Indexing |
Previous Message | Jonathan S. Katz | 2014-07-21 23:16:33 | IS NOT DISTINCT FROM + Indexing |