Re: How to hint 2 coulms IS NOT DISTINCT FROM each other

From: Kim Rose Carlsen <krc(at)hiper(dot)dk>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org>
Subject: Re: How to hint 2 coulms IS NOT DISTINCT FROM each other
Date: 2016-10-31 14:28:12
Message-ID: AM4PR0501MB2610D420071792C3F2D72E03C7AE0@AM4PR0501MB2610.eurprd05.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Sat, Oct 29, 2016 at 8:27 AM, Kim Rose Carlsen <krc(at)hiper(dot)dk> wrote:

> > I have tried creating a function called
> > zero_if_null(int) : int that just select COALESCE($1, 0)
> > and adding a index on (zero_if_null(customer_id)) on table that contains
> > customer_id. The only thing I get from is the planner now only knows how to
> > compare customer_id, but it still doesn't know that they are of same value,
> > only I know that and I want to declare it for the planner.

> Well, the *behavior* is mandated by the sql standard. Our
> implementation is slow however.

Sorry I'm not following, what behavior is mandated by the sql standard?

> I'm surprised the attached function
> didn't help, it can be inlined and I was able to get bitmap or which
> is pretty good. As I said upthread I think INDF could theoretically
> run as fast as equality -- it just doesn't today.

It might be harsh to say that it doesn't help at all. I does half the running time,
but I need it to run an order of magnitude faster. Here is the plan with the
empty_if_null (customer_id is actually varchar)

https://explain.depesz.com/s/M1LV with empty_if_null + functional index
https://explain.depesz.com/s/eOL with indf

> As your joins are written you could probably convert this by reserving
> a customer_id to the work that you're trying to do with null, say, 0,
> or -1. This is a somewhat dubious practice but seems a better fit for
> your use case. I don't think INDF is good in this usage.
>
> merlin

This will work well, I think.

But I'm not sure I can mentally accept an unfilled value should not be
null (eg. 0, '', '0000-01-01'). But I can see how the equals operator will
work well with this.

It might raise another problem, that the nulls are generated through LEFT
JOINS where now rows are defined. Then the 0 or -1 value need to be
a computed value. Won't this throw of index lookups? (I might be
more confused in this area).

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2016-10-31 14:29:43 Re: Way to quickly detect if database tables/columns/etc. were modified?
Previous Message Melvin Davidson 2016-10-31 14:17:49 Re: Way to quickly detect if database tables/columns/etc. were modified?