From: | Julien Rouhaud <rjuju123(at)gmail(dot)com> |
---|---|
To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Joel Jacobson <joel(at)compiler(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Schema variables - new implementation for Postgres 15 |
Date: | 2022-01-19 08:01:17 |
Message-ID: | 20220119080117.4c2fjzs5p7vtdxhk@jrouhaud |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Tue, Jan 18, 2022 at 10:01:01PM +0100, Pavel Stehule wrote:
> pá 14. 1. 2022 v 3:44 odesílatel Julien Rouhaud <rjuju123(at)gmail(dot)com> napsal:
>
> >
> > =# let myvariable = 'AA';
> > LET
> >
> > =# select 'AA' collate "en-x-icu" < myvariable;
> > ?column?
> > ----------
> > f
> > (1 row)
> >
> > =# select 'AA' collate "en-x-icu" < myvariable collate mycollation;
> > ERROR: 42P21: collation mismatch between explicit collations "en-x-icu"
> > and "mycollation"
> > LINE 1: select 'AA' collate "en-x-icu" < myvariable collate mycollat...
> >
>
> What do you expect? I don't understand collating well, but it looks
> correct. Minimally the tables have the same behavior.
Indeed, I actually didn't know that such object's collation were implicit and
could be overloaded without a problem as long as there's no conflict between
all the explicit collations. So I agree that the current behavior is ok,
including a correct handling for wanted conflicts:
=# create variable var1 text collate "fr-x-icu";
CREATE VARIABLE
=# create variable var2 text collate "en-x-icu";
CREATE VARIABLE
=# let var1 = 'hoho';
LET
=# let var2 = 'hoho';
LET
=# select var1 < var2;
ERROR: 42P22: could not determine which collation to use for string comparison
HINT: Use the COLLATE clause to set the collation explicitly.
> Please, can you check the attached patches?
All the issue I mentioned are fixed, thanks!
I see a few problems with the other new features added though. The new
session_variables_ambiguity_warning GUC is called even in contexts where it
shouldn't apply. For instance:
=# set session_variables_ambiguity_warning = 1;
SET
=# create variable v text;
CREATE VARIABLE
=# DO $$
DECLARE v text;
BEGIN
v := 'test';
RAISE NOTICE 'v: %', v;
END;
$$ LANGUAGE plpgsql;
WARNING: 42702: session variable "v" is shadowed by column
LINE 1: v := 'test'
^
DETAIL: The identifier can be column reference or session variable reference.
HINT: The column reference is preferred against session variable reference.
QUERY: v := 'test'
But this "v := 'test'" shouldn't be a substitute for a LET, and it indeed
doesn't work:
=# DO $$
BEGIN
v := 'test';
RAISE NOTICE 'v: %', v;
END;
$$ LANGUAGE plpgsql;
ERROR: 42601: "v" is not a known variable
LINE 3: v := 'test';
But the RAISE NOTICE does see the session variable (which should be the correct
behavior I think), so the warning should have been raised for this instruction
(and in that case the message is incorrect, as it's not shadowing a column).
Also, the pg_dump handling emits a COLLATION option for session variables even
for default collation, while it should only emit it if the collation is not the
type's default collation. As a reference, for attributes the SQL used is:
"CASE WHEN a.attcollation <> t.typcollation "
"THEN a.attcollation ELSE 0 END AS attcollation,\n"
Also, should \dV or \dV+ show the collation?
And a few comments on the new chunks in this version of the patch (I didn't
look in detail at the whole patch yet):
+ <para>
+ The session variables can be overshadowed by columns in an query. When query
+ holds identifier or qualified identifier that can be used as session variable
+ identifier and as column identifier too, then it is used as column identifier
+ every time. This situation can be logged by enabling configuration
+ parameter <xref linkend="guc-session-variables-ambiguity-warning"/>.
+ </para>
Is "overshadowed" correct? The rest of the patch only says "shadow(ed)".
While at it, here's some proposition to improve the phrasing:
+ The session variables can be shadowed by column references in a query. When a
+ query contains identifiers or qualified identifiers that could be used as both
+ a session variable identifiers and as column identifier, then the column
+ identifier is preferred every time. Warnings can be emitted when this situation
+ happens by enabling configuration parameter <xref
+ linkend="guc-session-variables-ambiguity-warning"/>.
Similarly, the next documentation could be rephrased to:
+ When on, a warning is raised when any identifier in a query could be used as both
+ a column identifier or a session variable identifier.
+ The default is <literal>off</literal>.
Few other nitpicking:
+ * If we really detect collision of column and variable identifier,
+ * then we prefer column, because we don't want to allow to break
+ * an existing valid queries by new variable.
s/an existing/existing
+-- it is ambigonuous, but columns are preferred
ambiguous?
@@ -369,6 +367,19 @@ VariableCreate(const char *varName,
/* dependency on extension */
recordDependencyOnCurrentExtension(&myself, false);
+ /*
+ * Normal dependency from a domain to its collation. We know the default
+ * collation is pinned, so don't bother recording it.
+ */
+ if (OidIsValid(varCollation) &&
+ varCollation != DEFAULT_COLLATION_OID)
The comment mentions domains rather than session variables.
And for the initial patch, while looking around I found this comment on
fix_alternative_subplan():
@@ -1866,7 +1969,9 @@ fix_alternative_subplan(PlannerInfo *root, AlternativeSubPlan *asplan,
* replacing Aggref nodes that should be replaced by initplan output Params,
* choosing the best implementation for AlternativeSubPlans,
* looking up operator opcode info for OpExpr and related nodes,
- * and adding OIDs from regclass Const nodes into root->glob->relationOids.
+ * and adding OIDs from regclass Const nodes into root->glob->relationOids,
+ * and replacing PARAM_VARIABLE paramid, that is the oid of the session variable
+ * to offset the array by query used session variables. ???
I don't really understand the comment, and the "???" looks a bit suspicious.
I'm assuming it's a reference to this new behavior in fix_param_node():
* fix_param_node
* Do set_plan_references processing on a Param
+ * Collect session variables list and replace variable oid by
+ * index to collected list.
*
* If it's a PARAM_MULTIEXPR, replace it with the appropriate Param from
* root->multiexpr_params; otherwise no change is needed.
* Just for paranoia's sake, we make a copy of the node in either case.
+ *
+ * If it's a PARAM_VARIABLE, then we should to calculate paramid.
Some improvement on the comments would be welcome there, probably including
some mention to the "glob->sessionVariables" collected list?
From | Date | Subject | |
---|---|---|---|
Next Message | Michael Paquier | 2022-01-19 08:13:18 | Re: pg_upgrade should truncate/remove its logs before running |
Previous Message | Amit Langote | 2022-01-19 07:59:00 | Re: a misbehavior of partition row movement (?) |