From: | Justin Pryzby <pryzby(at)telsasoft(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: PoC/WIP: Extended statistics on expressions |
Date: | 2021-03-24 16:42:26 |
Message-ID: | 20210324164226.GB15100@telsasoft.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Wed, Mar 24, 2021 at 01:24:46AM -0500, Justin Pryzby wrote:
> It seems like you're preferring to use pluralized "statistics" in a lot of
> places that sound wrong to me. For example:
> > Currently the first statistics wins, which seems silly.
> I can write more separately, but I think this is resolved and clarified if you
> write "statistics object" and not just "statistics".
In HEAD:catalogs.sgml, pg_statistic_ext (the table) says "object":
|Name of the statistics object
|Owner of the statistics object
|An array of attribute numbers, indicating which table columns are covered by this statistics object;
But pg_stats_ext (the view) doesn't say "object", which sounds wrong:
|Name of extended statistics
|Owner of the extended statistics
|Names of the columns the extended statistics is defined on
Other pre-existing issues: should be singular "statistic":
doc/src/sgml/perform.sgml: Another type of statistics stored for each column are most-common value
doc/src/sgml/ref/psql-ref.sgml: The status of each kind of extended statistics is shown in a column
Pre-existing issues: doesn't say "object" but I think it should:
src/backend/commands/statscmds.c: errmsg("statistics creation on system columns is not supported")));
src/backend/commands/statscmds.c: errmsg("cannot have more than %d columns in statistics",
src/backend/commands/statscmds.c: * If we got here and the OID is not valid, it means the statistics does
src/backend/commands/statscmds.c: * Select a nonconflicting name for a new statistics.
src/backend/commands/statscmds.c: * Generate "name2" for a new statistics given the list of column names for it
src/backend/statistics/extended_stats.c: /* compute statistics target for this statistics */
src/backend/statistics/extended_stats.c: * attributes the statistics is defined on, and then the default statistics
src/backend/statistics/mcv.c: * The input is the OID of the statistics, and there are no rows returned if
should say "for a statistics object" or "for statistics objects"
src/backend/statistics/extended_stats.c: * target for a statistics objects (from the object target, attribute targets
Your patch adds these:
Should say "object":
+ * Check if we actually have a matching statistics for the expression.
+ /* evaluate expressions (if the statistics has any) */
+ * for the extended statistics. The second option seems more reasonable.
+ * the statistics had all options enabled on the original version.
+ * But if the statistics is defined on just a single column, it has to
+ /* has the statistics expressions? */
+ /* expression - see if it's in the statistics */
+ * column(s) the statistics depends on. Also require all
+ * statistics is defined on more than one column/expression).
+ * statistics is useless, but harmless).
+ * If there are no simply-referenced columns, give the statistics an auto
+ * Then the first statistics matches no expressions and 3 vars,
+ * while the second statistics matches one expression and 1 var.
+ * Currently the first statistics wins, which seems silly.
+ * [(a+c), d]. But maybe it's better than failing to match the
+ * second statistics?
I can make patches for these (separate patches for HEAD and your patch), but I
don't think your patch has to wait on it, since the user-facing documentation
is consistent with what's already there, and the rest are internal comments.
--
Justin
From | Date | Subject | |
---|---|---|---|
Next Message | Robert Haas | 2021-03-24 16:43:16 | Re: [HACKERS] Custom compression methods |
Previous Message | Tomas Vondra | 2021-03-24 16:38:32 | Re: PoC/WIP: Extended statistics on expressions |