Re: Problems with estimating OR conditions, IS NULL on LEFT JOINs

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Problems with estimating OR conditions, IS NULL on LEFT JOINs
Date: 2023-06-24 00:08:34
Message-ID: 104950.1687565314@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
> The problem is that the selectivity for "IS NULL" is estimated using the
> table-level statistics. But the LEFT JOIN entirely breaks the idea that
> the null_frac has anything to do with NULLs in the join result.

Right.

> I wonder how to improve this, say by adjusting the IS NULL selectivity
> when we know to operate on the outer side of the join. We're able to
> do this for antijoins, so maybe we could do that here, somehow?

This mess is part of the long-term plan around the work I've been doing
on outer-join-aware Vars. We now have infrastructure that can let
the estimator routines see "oh, this Var isn't directly from a scan
of its table, it's been passed through a potentially-nulling outer
join --- and I can see which one". I don't have more than vague ideas
about what happens next, but that is clearly an essential step on the
road to doing better.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-06-24 00:12:11 Re: psql: Add role's membership options to the \du+ command
Previous Message Cary Huang 2023-06-23 21:31:11 Re: sslinfo extension - add notbefore and notafter timestamps