From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | tharakan(at)gmail(dot)com, pgsql-bugs(at)lists(dot)postgresql(dot)org, PG Bug reporting form <noreply(at)postgresql(dot)org>, Alexander Korotkov <aekorotkov(at)gmail(dot)com> |
Subject: | Re: BUG #18885: ERROR: corrupt MVNDistinct entry - 2 |
Date: | 2025-04-13 23:26:10 |
Message-ID: | CAApHDvocZCUhM9W9mJ39d6oQz7ePKoqFnao_347mvC-A7QatcQ@mail.gmail.com |
Lists: | pgsql-bugs |
On Fri, 11 Apr 2025 at 01:31, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> I think estimate_multivariate_bucketsize() needs to be more careful
> about building the GroupVarInfo list - in particular, it needs to do the
> dance with examine_variable + add_unique_group_var + pull_var_clause,
> similar to estimate_num_groups() at line ~3532.
This requirement should be documented so that future callers of
estimate_multivariate_ndistinct() don't fall into the same trap.
The attached aims to do this. I also couldn't resist a few other improvements.
There are a few strange goings-on in the code itself that I didn't
adjust. For example, in the first "foreach(lc2, *varinfos)" loop after
the "if (stats)", there's a "found" variable that gets set and used
for no apparent reason. I don't see why the "found = true;" isn't
just a "continue;". The variable would only be needed if there were
some inner loop and we couldn't use "continue". I also can't make sense
of the following comment:
    /*
     * XXX Maybe we should allow searching the expressions even if we
     * found an attribute matching the expression? That would handle
     * trivial expressions like "(a)" but it seems fairly useless.
     */
Maybe it meant "matching the Var"?
The final loop to build the newlist also looks more complex than it
needs to be. The prior loop over *varinfos could have recorded the
list indexes of the matching GroupVarInfos in a Bitmapset, and that
final loop could become:
    foreach(lc, *varinfos)
    {
        if (!bms_is_member(foreach_current_index(lc), matched_varinfos))
            newlist = lappend(newlist, lfirst(lc));
    }
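To show the shape of that idea outside the backend, here is a minimal standalone sketch. It is not PostgreSQL code: a plain int array stands in for the List, and a 64-bit mask (the hypothetical IndexSet type, with helpers index_set_add and index_set_is_member) stands in for the Bitmapset. The first pass is assumed to have already recorded the matched indexes; keep_unmatched() is the equivalent of the simplified final loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: one bit per list index (max 64). */
typedef uint64_t IndexSet;

/* Return "set" with index "idx" added. */
static IndexSet
index_set_add(IndexSet set, size_t idx)
{
    assert(idx < 64);
    return set | (UINT64_C(1) << idx);
}

/* Return 1 if index "idx" is in "set", else 0. */
static int
index_set_is_member(IndexSet set, size_t idx)
{
    assert(idx < 64);
    return (int) ((set >> idx) & 1);
}

/*
 * Copy every entry of "items" whose index is NOT in "matched" into "out",
 * preserving order.  Returns the number of entries copied.  "out" must have
 * room for "nitems" entries.
 */
static size_t
keep_unmatched(const int *items, size_t nitems, IndexSet matched, int *out)
{
    size_t      nout = 0;

    for (size_t i = 0; i < nitems; i++)
    {
        if (!index_set_is_member(matched, i))
            out[nout++] = items[i];
    }
    return nout;
}
```

The point of the design is that the second pass needs no inner search: membership is a single test on the index, so the "which entries were consumed" bookkeeping lives entirely in the set built by the first pass.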
David
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Improve-comments-for-estimate_multivariate_ndisti.patch | application/octet-stream | 4.0 KB |