From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
---|---|
To: | Andrei Lepikhov <lepihov(at)gmail(dot)com> |
Cc: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: ERROR: corrupt MVNDistinct entry |
Date: | 2024-12-25 09:36:54 |
Message-ID: | CAMbWs4-0zvk_K-0eZp238fu9GNOjdZWZ6-0f2d21EQRegU_D9A@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Dec 25, 2024 at 5:14 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> On Wed, Dec 25, 2024 at 11:34 AM Andrei Lepikhov <lepihov(at)gmail(dot)com> wrote:
> > 2. It is ok for Vars. But what about expressions? We use equal() in
> > distinct, MCV and dependencies modules. Do we need to remove nulls
> > before using extended statistics as a general rule?
>
> AFAIU, the expressions in extended statistics are not decorated with
> any nullingrels bits, are they?
I've just realized that there may be other places with similar issues,
not just in estimate_num_groups. For instance,
-- after v16
explain (costs on)
select * from t t1 left join t t2 on true where (t2.a+t2.b) is null;
QUERY PLAN
--------------------------------------------------------------------
Nested Loop Left Join (cost=0.00..15032.50 rows=5000 width=16)
Filter: ((t2.a + t2.b) IS NULL)
-> Seq Scan on t t1 (cost=0.00..15.00 rows=1000 width=8)
-> Materialize (cost=0.00..20.00 rows=1000 width=8)
-> Seq Scan on t t2 (cost=0.00..15.00 rows=1000 width=8)
(5 rows)
-- before v16
explain (costs on)
select * from t t1 left join t t2 on true where (t2.a+t2.b) is null;
QUERY PLAN
--------------------------------------------------------------------
Nested Loop Left Join (cost=0.00..15032.50 rows=1 width=16)
Filter: ((t2.a + t2.b) IS NULL)
-> Seq Scan on t t1 (cost=0.00..15.00 rows=1000 width=8)
-> Materialize (cost=0.00..20.00 rows=1000 width=8)
-> Seq Scan on t t2 (cost=0.00..15.00 rows=1000 width=8)
(5 rows)
In v16 and later, the nullingrels within the expression "t2.a + t2.b"
prevent it from being matched to the corresponding expression in
extended statistics, forcing us to use DEFAULT_UNK_SEL(0.005).
It seems that we need to strip out the nullingrels bits from
expressions before matching them to extended statistics or
expressional index columns in more places.
Thanks
Richard
From | Date | Subject | |
---|---|---|---|
Next Message | Japin Li | 2024-12-25 09:44:17 | Fix typo in comment of compute_return_type() |
Previous Message | Michael Paquier | 2024-12-25 09:28:52 | Re: Pass ParseState as down to utility functions. |