Re: anti-join with small table via text/varchar cannot estimate rows correctly

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Stefan Andreatta <s(dot)andreatta(at)synedra(dot)com>, "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: anti-join with small table via text/varchar cannot estimate rows correctly
Date: 2017-03-02 00:28:28
Message-ID: CAKFQuwYCXF1oOk6x-af+2WFEHrgD-sGFSyQqM3KPYDOKkiwEag@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

On Wed, Mar 1, 2017 at 5:24 PM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:

> On Wed, Mar 1, 2017 at 2:12 PM, David G. Johnston <
> david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
>> On Wed, Mar 1, 2017 at 3:00 PM, Stefan Andreatta <s(dot)andreatta(at)synedra(dot)com
>> > wrote:
>>
>>> plain analyze
>>> select tmp_san_1.id
>>> from tmp_san_1
>>> left join tmp_san_2 on tmp_san_1.text = tmp_san_2.text
>>> where tmp_san_2.id is null;
>>>
>>> ​Does it help if you check for "tmp_san_2.text is null"?
>>
>>
>>
> Yes. And if you swap it so that the left join is on the integer while IS
> NULL is on the text, that also gets poorly estimated. Also, if you make
> both column of both tables be integers, same thing--you get bad estimates
> when the join condition refers to one column and the where refers to the
> other. I don't know why the estimate is poor, but it is not related to the
> types of the columns, but rather the identities of them.
>
>
​I suspect it has to with the lack of a NOT NULL constraint on either
column causing the planner to disregard the potential to implement a LEFT
JOIN using ANTI-JOIN semantics - or, also possible - the form itself is
invalid regardless of the presence or absence of contraints. IIUC, while a
true anti-join syntax doesn't exist the canonical form for one uses NOT
EXISTS - which would force the author to use only the correct column pair.

David J.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Tom Lane 2017-03-02 01:06:58 Re: anti-join with small table via text/varchar cannot estimate rows correctly
Previous Message Jeff Janes 2017-03-02 00:24:14 Re: anti-join with small table via text/varchar cannot estimate rows correctly