From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
---|---|
To: | PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | ERROR: corrupt MVNDistinct entry |
Date: | 2024-12-24 08:00:57 |
Message-ID: | CAMbWs4-2Z4k+nFTiZe0Qbu5n8juUWenDAtMzi98bAZQtwHx0-w@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I ran into this error in estimate_multivariate_ndistinct, and it can
be reproduced with the query below.
create table t (a int, b int);
insert into t select 1, 1;
create statistics s (ndistinct) on a, b from t;
analyze;
explain select 1 from t t1
left join (select a c1, coalesce(a) c2 from t t2) s on true
group by s.c1, s.c2;
ERROR: corrupt MVNDistinct entry
And the first bad commit is:
2489d76c4906f4461a364ca8ad7e0751ead8aa0d is the first bad commit
commit 2489d76c4906f4461a364ca8ad7e0751ead8aa0d
Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Date: Mon Jan 30 13:16:20 2023 -0500
Make Vars be outer-join-aware.
So in this query, there are two grouping expressions: s.c1 is Var t2.a
with nullingrels set to {3}; s.c2 is a PHV with nullingrels also being
{3}, and its contained expression is Var t2.a with empty nullingrels.
This eventually leads to estimate_num_groups creating two separate
GroupVarInfos for Var t2.a: one with nullingrels {3}, and another with
empty nullingrels. As a result, estimate_multivariate_ndistinct
incorrectly assumes there are two matching expressions. When it later
fails to find the exact match for the combination, it mistakenly
concludes that there is a corrupt MVNDistinct entry.
It seems to me that when estimating the number of groups, we do not
need to concern ourselves with the outer joins that could null the
Vars/PHVs contained in the grouping expressions, and we should not
count the same Var more than once.
So I wonder if we can fix this issue by removing the nullingrels
within the grouping expressions first in estimate_num_groups. Such as:
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -121,6 +121,7 @@
#include "parser/parse_clause.h"
#include "parser/parse_relation.h"
#include "parser/parsetree.h"
+#include "rewrite/rewriteManip.h"
#include "statistics/statistics.h"
#include "storage/bufmgr.h"
#include "utils/acl.h"
@@ -3446,6 +3447,10 @@ estimate_num_groups(PlannerInfo *root, List
*groupExprs, double input_rows,
if (groupExprs == NIL || (pgset && *pgset == NIL))
return 1.0;
+ groupExprs = (List *) remove_nulling_relids((Node *) groupExprs,
+ root->outer_join_rels,
+ NULL);
+
/*
* Count groups derived from boolean grouping expressions. For other
* expressions, find the unique Vars used, treating an expression as a Var
Any thoughts?
Thanks
Richard
From | Date | Subject | |
---|---|---|---|
Next Message | torikoshia | 2024-12-24 08:53:06 | RFC: Allow EXPLAIN to Output Page Fault Information |
Previous Message | Michael Paquier | 2024-12-24 07:38:51 | Re: Recovering from detoast-related catcache invalidations |