From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Kim Rose Carlsen <krc(at)hiper(dot)dk> |
Cc: | Scott Marlowe <scott(dot)marlowe(at)gmail(dot)com>, "pgsql-general(at)postgresql(dot)org" <pgsql-general(at)postgresql(dot)org> |
Subject: | Re: How to hint 2 coulms IS NOT DISTINCT FROM each other |
Date: | 2016-10-31 13:19:15 |
Message-ID: | CAHyXU0wkB+xZYXNqfN4PVWPfA5-hJ7UwOk=YOLsJsUZG0N8EHw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Sat, Oct 29, 2016 at 8:27 AM, Kim Rose Carlsen <krc(at)hiper(dot)dk> wrote:
>>> This doesn't do much good. This doesn't tell the planner that the 3
>
>>> customer_ids are actually of same value, and it therefore can't filter
>>> them
>>> as it sees fit.
>
>> You do know you can index on a function, and the planner then keeps
>> stats on it when you run analyze right?
>
> Yes, but I don't think it will make any difference. I don't think I can
> solve this with
> an index lookup. I think my savior is the inference that the 2 columns are
> of
> same value and the planner are free to choose which order to do the filter
> and join
> with this extra information.
>
> I have tried creating a function called
> zero_if_null(int) : int that just select COALESCE($1, 0)
> and adding a index on (zero_if_null(customer_id)) on table that contains
> customer_id. The only thing I get from is the planner now only knows how to
> compare customer_id, but it still doesn't know that they are of same value,
> only I know that and I want to declare it for the planner.
>
> I could probably rewrite the whole view in one query, and then fix it with a
> proper index. But I think I will loose alot of readability.
>
> I could also change the structure to save an explicit state, instead of a
> calculated state. But then I get some redundancy I need to make sure always
> stays the same.
>
> In the end one of these will probably be the solution.
>
> I guess the question is more or less,
>
> why doesn't IS NOT DISTINCT FROM behave the same way as = operator, are
> there any alternatives? And a plausible use case for when it would be
> useful.
Well, the *behavior* is mandated by the sql standard. Our
implementation is slow however. I'm surprised the attached function
didn't help, it can be inlined and I was able to get bitmap or which
is pretty good. As I said upthread I think INDF could theoretically
run as fast as equality -- it just doesn't today.
As your joins are written you could probably convert this by reserving
a customer_id to the work that you're trying to do with null, say, 0,
or -1. This is a somewhat dubious practice but seems a better fit for
your use case. I don't think INDF is good in this usage.
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Daevor The Devoted | 2016-10-31 13:24:15 | If pg_ctl dies, can we attach a new pg_ctl process to postgres? |
Previous Message | Melvin Davidson | 2016-10-31 13:14:07 | Re: Way to quickly detect if database tables/columns/etc. were modified? |