From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Alena Rybakina <lena(dot)ribackina(at)yandex(dot)ru>, Andrey Lepikhov <a(dot)lepikhov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Problems with estimating OR conditions, IS NULL on LEFT JOINs
Date: 2023-06-28 21:53:32
Message-ID: 63baadec-933b-53f8-5653-99f490dce09e@enterprisedb.com
Lists: pgsql-hackers

On 6/26/23 20:15, Alena Rybakina wrote:
> Hi, all!
>
> On 24.06.2023 14:23, Tomas Vondra wrote:
>> On 6/24/23 02:08, Tom Lane wrote:
>>> Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
>>>> The problem is that the selectivity for "IS NULL" is estimated using the
>>>> table-level statistics. But the LEFT JOIN entirely breaks the idea that
>>>> the null_frac has anything to do with NULLs in the join result.
>>> Right.
>>>
>>>> I wonder how to improve this, say by adjusting the IS NULL selectivity
>>>> when we know to operate on the outer side of the join. We're able to
>>>> do this for antijoins, so maybe we could do that here, somehow?
>>> This mess is part of the long-term plan around the work I've been doing
>>> on outer-join-aware Vars. We now have infrastructure that can let
>>> the estimator routines see "oh, this Var isn't directly from a scan
>>> of its table, it's been passed through a potentially-nulling outer
>>> join --- and I can see which one". I don't have more than vague ideas
>>> about what happens next, but that is clearly an essential step on the
>>> road to doing better.
>>>
>> I was wondering if that work on outer-join-aware Vars could help with
>> this, but I wasn't following it very closely. I agree the ability to
>> check if the Var could be NULL due to an outer join seems useful, as it
>> says whether applying raw attribute statistics makes sense or not.
>>
>> I was thinking about what to do for the case when that's not possible,
>> i.e. when the Var refers to the nullable side of the join. Knowing that this
>> is happening is clearly not enough - we need to know how many new NULLs
>> are "injected" into the join result, and "communicate" that to the
>> estimation routines.
>>
>> Attached is a very ugly experimental patch doing that, and with it the
>> estimate changes to this:
>>
>> QUERY PLAN
>> ----------------------------------------------------------------------
>> Hash Left Join (cost=3.25..18179.88 rows=999900 width=16)
>> (actual time=0.528..596.151 rows=999900 loops=1)
>> Hash Cond: (large.id = small.id)
>> Filter: ((small.id IS NULL) OR
>> (large.a = ANY ('{1000,2000,3000,4000,5000}'::integer[])))
>> Rows Removed by Filter: 100
>> -> Seq Scan on large (cost=0.00..14425.00 rows=1000000 width=8)
>> (actual time=0.069..176.138 rows=1000000 loops=1)
>> -> Hash (cost=2.00..2.00 rows=100 width=8)
>> (actual time=0.371..0.373 rows=100 loops=1)
>> Buckets: 1024 Batches: 1 Memory Usage: 12kB
>> -> Seq Scan on small (cost=0.00..2.00 rows=100 width=8)
>> (actual time=0.032..0.146 rows=100 loops=1)
>> Planning Time: 3.845 ms
>> Execution Time: 712.405 ms
>> (10 rows)
>>
>> Seems nice, but. The patch is pretty ugly, I don't claim it works for
>> other queries or that this is exactly what we should do. It calculates
>> "unmatched frequency" next to eqjoinsel_inner, stashes that info into
>> sjinfo and the estimator (nulltestsel) then uses that to adjust the
>> nullfrac it gets from the statistics.
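
To make the nulltestsel part concrete, this is roughly the kind of
adjustment I mean - just a sketch with made-up names, not the code from
the patch:

/*
 * Sketch: combine the column's null fraction from pg_statistic with the
 * estimated fraction of outer rows that have no join partner.  Unmatched
 * outer rows get NULLs injected into every column of the nullable side,
 * while matched rows keep the original null fraction.
 */
#include <stdio.h>

double
left_join_isnull_sel(double stanullfrac, double unmatched_frac)
{
    return unmatched_frac + (1.0 - unmatched_frac) * stanullfrac;
}

int
main(void)
{
    /* large (1M rows) LEFT JOIN small (100 rows): ~999900 rows unmatched */
    double unmatched_frac = 999900.0 / 1000000.0;

    /* small.id has no NULLs in the base table, so stanullfrac = 0 */
    printf("%g\n", left_join_isnull_sel(0.0, unmatched_frac));
    /* 0.9999 of the 1000000 join rows, which lines up with rows=999900 */
    return 0;
}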
>>
>> The good thing is this helps even for IS NULL checks on non-join-key
>> columns (where we don't switch to an antijoin), but there are a couple of
>> things that I dislike ...
>>
>> 1) It's not restricted to outer joins or anything like that (this is
>> mostly just my laziness / interest in one particular query, but also
>> something the outer-join-aware patch might help with).
>>
>> 2) We probably don't want to pass this kind of information through
>> sjinfo. That was the simplest thing for an experimental patch, but I
>> suspect it's not the only piece of information we may need to pass to
>> the lower levels of estimation code.
>>
>> 3) I kinda doubt we actually want to move this responsibility (to
>> consider fraction of unmatched rows) to the low-level estimation
>> routines (e.g. nulltestsel and various others). AFAICS this just
>> "introduces NULLs" into the relation, so maybe we could "adjust" the
>> attribute statistics (in examine_variable?) by inflating null_frac and
>> modifying the other frequencies in MCV/histogram.
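
For (3), what I have in mind is something like this - purely an
illustration with made-up names, not what examine_variable actually does:

/*
 * Sketch: "inject" the unmatched fraction into a copy of the attribute
 * statistics before the regular estimators see them.  MCV (or histogram)
 * entries can only come from rows that did find a match, so their
 * frequencies get scaled down, and the unmatched fraction is folded
 * into null_frac.
 */
void
inflate_null_frac(double *mcv_freqs, int nmcv,
                  double *null_frac, double unmatched_frac)
{
    double  matched_frac = 1.0 - unmatched_frac;
    int     i;

    for (i = 0; i < nmcv; i++)
        mcv_freqs[i] *= matched_frac;

    *null_frac = unmatched_frac + matched_frac * (*null_frac);
}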
>>
>> 4) But I'm not sure we actually want to do that in these low-level
>> selectivity functions. The outer join essentially produces output with
>> two subsets - one with matches on the outer side, one without them. But
>> the side without matches has NULLs in all columns. In a way, we know
>> exactly how these columns are correlated - if we do the usual estimation
>> (even with the null_frac adjusted), we just throw this information away.
>> And when there are a lot of rows without a match, that seems bad.
>>
>> So maybe we should split the join estimate into two parts, one for each
>> subset of the join result. One for the rows with a match (and then we
>> can just do what we do now, with the attribute stats we already have).
>> And one for the "unmatched part" where we know the values on the outer
>> side are NULL (and then we can easily "fake" stats with null_frac=1.0).
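
In code, the split might look something like this (again just a sketch,
with made-up names):

/*
 * Sketch: estimate the two subsets of the outer-join result separately
 * and combine them.  clause_sel_matched is whatever the existing
 * clause-list estimation produces with the ordinary attribute statistics;
 * clause_sel_unmatched is the same computation run against "faked" stats
 * where every column of the nullable side has null_frac = 1.0.
 */
double
outer_join_clause_sel(double matched_frac,
                      double clause_sel_matched,
                      double clause_sel_unmatched)
{
    return matched_frac * clause_sel_matched
         + (1.0 - matched_frac) * clause_sel_unmatched;
}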
>>
>>
>> I really hope what I just wrote makes at least a little bit of sense.
>>
>>
>> regards
>>
> I am also interested in this problem.
>
> I did some refactoring of the source code in the patch and moved the
> calculation of unmatched_fraction into eqjoinsel_inner.
> I added myself to the commit as a co-author, if you don't mind, and I'm
> going to continue working on it.
>
Sure, if you want to take over the patch and continue working on it,
that's perfectly fine with me. I'm happy to cooperate on it, doing
reviews etc.
>
> On 26.06.2023 12:22, Andrey Lepikhov wrote:
>> On 24/6/2023 17:23, Tomas Vondra wrote:
>>> I really hope what I just wrote makes at least a little bit of sense.
>> Throw in one more example:
>>
>> SELECT i AS id INTO l FROM generate_series(1,100000) i;
>> CREATE TABLE r (id int8, v text);
>> INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
>> ANALYZE l,r;
>> EXPLAIN ANALYZE
>> SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
>>
>> Here you can see the same kind of underestimation:
>> Hash Left Join (... rows=500 width=14) (... rows=99999 ...)
>>
>> So the eqjoinsel_unmatch_left() function should be modified for the
>> case where nd1<nd2.
>>
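
For illustration, one simple-minded way to get that unmatched fraction
from the two ndistinct estimates alone, independent of which operand of
the '=' happens to be nd1 or nd2 - just a sketch, assuming uniform
distribution and that the inner side's values all appear on the outer
side (made-up name, not the patch function):

/*
 * Sketch: fraction of outer-side rows without a join partner, estimated
 * from the ndistinct estimates of the two join keys.
 */
double
outer_unmatched_frac(double nd_outer, double nd_inner)
{
    if (nd_outer <= 0.0 || nd_inner >= nd_outer)
        return 0.0;             /* every outer value may find a match */

    return (nd_outer - nd_inner) / nd_outer;
}

/*
 * In the example above: l has 100000 distinct ids, r has 2, so
 * outer_unmatched_frac(100000, 2) = 0.99998, i.e. ~99998 of 100000 rows,
 * much closer to the actual 99999 than the rows=500 estimate.
 */
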
> Unfortunately, this patch does not fix the cardinality estimate for this
> query; I'll try to look into it and figure out what is missing here.
>
> postgres=# SELECT i AS id INTO l FROM generate_series(1,100000) i;
> CREATE TABLE r (id int8, v text);
> INSERT INTO r (id, v) VALUES (1, 't'), (-1, 'f');
> ANALYZE l,r;
> EXPLAIN ANALYZE
> SELECT * FROM l LEFT OUTER JOIN r ON (r.id = l.id) WHERE r.v IS NULL;
> SELECT 100000
> CREATE TABLE
> INSERT 0 2
> ANALYZE
>                             QUERY PLAN
> ----------------------------------------------------------------------
>  Hash Left Join  (cost=1.04..1819.07 rows=1 width=14)
>                  (actual time=0.143..114.792 rows=99999 loops=1)
>    Hash Cond: (l.id = r.id)
>    Filter: (r.v IS NULL)
>    Rows Removed by Filter: 1
>    ->  Seq Scan on l  (cost=0.00..1443.00 rows=100000 width=4)
>                       (actual time=0.027..35.278 rows=100000 loops=1)
>    ->  Hash  (cost=1.02..1.02 rows=2 width=10)
>              (actual time=0.014..0.017 rows=2 loops=1)
>          Buckets: 1024  Batches: 1  Memory Usage: 9kB
>          ->  Seq Scan on r  (cost=0.00..1.02 rows=2 width=10)
>                             (actual time=0.005..0.007 rows=2 loops=1)
>  Planning Time: 0.900 ms
>  Execution Time: 126.180 ms
> (10 rows)
>
>
> As with the previous query, the cardinality is estimated poorly here even
> with the patch applied; I would even say that it has become worse:
>
> EXPLAIN ANALYZE
> SELECT * FROM large FULL JOIN small ON (large.id = small.id)
> WHERE (large.a IS NULL);
>
> MASTER:
>
>                                QUERY PLAN
> ----------------------------------------------------------------------
>  Merge Full Join  (cost=127921.69..299941.59 rows=56503 width=16)
>                   (actual time=795.092..795.094 rows=0 loops=1)
>    Merge Cond: (small.id = large.id)
>    Filter: (large.a IS NULL)
>    Rows Removed by Filter: 1000000
>    ->  Sort  (cost=158.51..164.16 rows=2260 width=8)
>              (actual time=0.038..0.046 rows=100 loops=1)
>          Sort Key: small.id
>          Sort Method: quicksort  Memory: 29kB
>          ->  Seq Scan on small  (cost=0.00..32.60 rows=2260 width=8)
>                                 (actual time=0.013..0.022 rows=100 loops=1)
>    ->  Materialize  (cost=127763.19..132763.44 rows=1000050 width=8)
>                     (actual time=363.016..649.103 rows=1000000 loops=1)
>          ->  Sort  (cost=127763.19..130263.31 rows=1000050 width=8)
>                    (actual time=363.012..481.480 rows=1000000 loops=1)
>                Sort Key: large.id
>                Sort Method: external merge  Disk: 17664kB
>                ->  Seq Scan on large  (cost=0.00..14425.50 rows=1000050 width=8)
>                                       (actual time=0.009..111.166 rows=1000000 loops=1)
>  Planning Time: 0.124 ms
>  Execution Time: 797.139 ms
> (15 rows)
>
> With patch:
>
>                                QUERY PLAN
> ----------------------------------------------------------------------
>  Hash Full Join  (cost=3.25..18179.25 rows=999900 width=16)
>                  (actual time=261.480..261.482 rows=0 loops=1)
>    Hash Cond: (large.id = small.id)
>    Filter: (large.a IS NULL)
>    Rows Removed by Filter: 1000000
>    ->  Seq Scan on large  (cost=0.00..14425.00 rows=1000000 width=8)
>                           (actual time=0.006..92.827 rows=1000000 loops=1)
>    ->  Hash  (cost=2.00..2.00 rows=100 width=8)
>              (actual time=0.032..0.034 rows=100 loops=1)
>          Buckets: 1024  Batches: 1  Memory Usage: 12kB
>          ->  Seq Scan on small  (cost=0.00..2.00 rows=100 width=8)
>                                 (actual time=0.008..0.015 rows=100 loops=1)
>  Planning Time: 0.151 ms
>  Execution Time: 261.529 ms
> (10 rows)
>
> In addition, I found a few more queries where the cardinality estimation
> has become better with the patch:
>
>
Yes, this does not surprise me at all - the patch was very early /
experimental code, and I only really aimed it at that single example
query. So don't hesitate to rethink what/how the patch works.
I do think collecting / constructing a wider set of queries with joins
of different types is going to be an important step in working on this
patch and making sure it doesn't break anything.
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company