From: | "Jamison, Kirk" <k(dot)jamison(at)jp(dot)fujitsu(dot)com> |
---|---|
To: | 'Tomas Vondra' <tomas(dot)vondra(at)2ndquadrant(dot)com> |
Cc: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | RE: Multivariate MCV list vs. statistics target |
Date: | 2019-07-26 07:03:41 |
Message-ID: | D09B13F772D2274BB348A310EE3027C65177D0@g01jpexmbkw24 |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, July 20, 2019 8:12 AM (GMT+9), Tomas Vondra wrote:
> >+ /* XXX What if the target is set to 0? Reset the statistic?
> */
> >
> >This also makes me wonder. I haven't looked deeply into the code, but
> >since 0 is a valid value, I believe it should reset the stats.
>
> I agree resetting the stats after setting the target to 0 seems quite
> reasonable. But that's not what we do for attribute stats, because in that
> case we simply skip the attribute during the future ANALYZE runs - we don't
> reset the stats, we keep the existing stats. So I've done the same thing here,
> and I've removed the XXX comment.
>
> If we want to change that, I'd do it in a separate patch for both the regular
> and extended stats.
Hi, Tomas
Sorry for my late reply.
You're right. I have no strong opinion whether we'd want to change that behavior.
I've also confirmed the change in the patch where setting statistics target 0
skips the statistics.
Maybe only some minor nitpicks in the source code comments below:
1. "it's" should be "its":
> + * Compute statistic target, based on what's set for the statistic
> + * object itself, and for it's attributes.
2. Consistency whether you'd use either "statistic " or "statisticS ".
Ex. statistic target vs statisticS target, statistics object vs statistic object, etc.
> Attached is v4 of the patch, with a couple more improvements:
>
> 1) I've renamed the if_not_exists flag to missing_ok, because that's more
> consistent with the "IF EXISTS" clause in the grammar (the old flag was kinda
> the exact opposite), and I've added a NOTICE about the skip.
+ bool missing_ok; /* do nothing if statistics does not exist */
Confirmed. So we ignore if statistic does not exist, and skip the error.
Maybe to make it consistent with other data structures in parsernodes.h,
you can change the comment of missing_ok to:
/* skip error if statistics object does not exist */
> 2) I've renamed ComputeExtStatsTarget to ComputeExtStatsRows, because that's
> what the function was doing anyway (computing sample size).
>
> 3) I've added a couple of regression tests to stats_ext.sql
>
> Aside from that, I've cleaned up a couple of places and improved a bunch of
> comments. Nothing huge.
I have a question though regarding ComputeExtStatisticsRows.
I'm just curious with the value 300 when computing sample size.
Where did this value come from?
+ /* compute sample size based on the statistic target */
+ return (300 * result);
Overall, the patch is almost already in good shape for commit.
I'll wait for the next update.
Regards,
Kirk Jamison
From | Date | Subject | |
---|---|---|---|
Next Message | Julien Rouhaud | 2019-07-26 07:36:32 | Re: Add parallelism and glibc dependent only options to reindexdb |
Previous Message | Amit Kapila | 2019-07-26 06:55:15 | Re: POC: Cleaning up orphaned files using undo logs |