Re: Support logical replication of DDLs

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>, Japin Li <japinli(at)hotmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, rajesh singarapu <rajesh(dot)rs0541(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Zheng Li <zhengli10(at)gmail(dot)com>
Subject: Re: Support logical replication of DDLs
Date: 2022-08-05 06:03:29
Message-ID: CAHut+PvpGCmqEodeGf8R=ajFmf-Bac_LPHeN1iPb2gAxXwzn3A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hi Hou-san, here are my review comments for the patch v15-0001:

======

1. Commit Message

CREATE/ALTER/DROP TABLE (*)

At first, I thought "(*)" looks like a SQL syntax element.

SUGGESTION:

CREATE/ALTER/DROP TABLE - - Note #1, Note #2
...
Note #1 – blah blah
Note #2 – yada yada

======

2. src/backend/commands/ddl_deparse.c - General

2.1
Lots of the deparse_XXX function are in random places scattered around
in this module. Since there are so many, I think it's better to have
functions arrange alphabetically to make them easier to find. (And if
there are several functions that logically "belong" together then
those should be re-named so they will be group together
alphabetically...

Same applies to other functions – not just the deparse_XXX ones

2.2
There are lots of 'tmp' (or 'tmp2') variables in this file. Sometimes
'tmp' is appropriate (or maybe 'tmpobj' would be better) but in other
cases it seems like there should be a better name than 'tmp'. Please
search all these and replace where you can use a more meaningful name
than tmp.

2.3
Pointer NULL comparisons are not done consistently all through the
file. E.g. Sometimes you do tree->fmtinfo == NULL, but in others you
do like if (!tree->fmtinfo). It's no big deal whichever way you want
to use, but at least for the comparisons involving the same variables
IMO should use the same style consistently.

~~~

3. src/backend/commands/ddl_deparse.c - format_type_detailed

3.1
+ * - typename is set to the type name, without quotes

But the param is called 'typname', not 'typename'

3.2
+ * - typmod is set to the typemod, if any, as a string with parens

I think you mean:
"typmod is set" -> "typemodstr is set"

3.3
+ if (IsTrueArrayType(typeform) &&
+ typeform->typstorage != TYPSTORAGE_PLAIN)
+ {
+ /* Switch our attention to the array element type */
+ ReleaseSysCache(tuple);
+ tuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(array_base_type));
+ if (!HeapTupleIsValid(tuple))
+ elog(ERROR, "cache lookup failed for type %u", type_oid);
+
+ typeform = (Form_pg_type) GETSTRUCT(tuple);
+ type_oid = array_base_type;
+ *typarray = true;
+ }
+ else
+ *typarray = false;

Maybe this if/else can be simplified

*typarray = IsTrueArrayType(typeform) && typeform->typstorage !=
TYPSTORAGE_PLAIN;
if (*typarray)
{
...
}

3.4
+ /* otherwise, WITH TZ is added by typmod. */

Uppercase comment

~~~

4. src/backend/commands/ddl_deparse.c - append_object_to_format_string

+ for (cp = sub_fmt; cp < end_ptr; cp++)
+ {
+ if (*cp == '{')
+ {
+ start_copy = true;
+ continue;
+ }

What's this logic going to do if it encounters "{{" - it looks like it
will just keep going but wouldn't that be a name error to have a "{"
in it?

~~~

5. src/backend/commands/ddl_deparse.c - append_bool_object

+append_bool_object(ObjTree *tree, char *sub_fmt, bool value)
+{
+ ObjElem *param;
+ char *object_name = sub_fmt;
+ bool is_present_flag = false;
+
+ Assert(sub_fmt);
+
+ if (strcmp(sub_fmt, "present") == 0)
+ {
+ is_present_flag = true;
+ tree->present = value;
+ }
+
+ if (!verbose && !tree->present)
+ return;
+
+ if (!is_present_flag)
+ object_name = append_object_to_format_string(tree, sub_fmt);
+
+ param = new_object(ObjTypeBool, object_name);
+ param->value.boolean = value;
+ append_premade_object(tree, param);
+}

It feels like there is some subtle trickery going on here with the
conditions. Is there a simpler way to write this, or maybe it is just
lacking some explanatory comments?

~~~

6. src/backend/commands/ddl_deparse.c - append_array_object

+ if (!verbose)
+ {
+ ListCell *lc;
+
+ /* Extract the ObjElems whose present flag is true */
+ foreach(lc, array)
+ {
+ ObjElem *elem = (ObjElem *) lfirst(lc);
+
+ Assert(elem->objtype == ObjTypeObject);
+
+ if (!elem->value.object->present)
+ array = foreach_delete_current(array, lc);
+ }
+
+ if (list_length(array) == 0)
+ return;
+ }

Maybe it is OK as-is. I'm just wondering if this list_length(array)
check should be outside of the !verbose check?

~~~

7. src/backend/commands/ddl_deparse.c - objtree_to_jsonb_element

+ case ObjTypeObject:
+ /* recursively add the object into the existing parse state */

Uppercase comment

~~~

8. src/backend/commands/ddl_deparse.c - new_objtree_for_qualname_id

8.1
+ *
+ * Elements "schemaname" and "objname" are set. If the object is a temporary
+ * object, the schema name is set to "pg_temp".

I'm not sure if this is the right place to say this, since it is not
really this function that sets that "pg_temp".

8.2
+ if (isnull)
+ elog(ERROR, "unexpected NULL namespace");
+ objname = heap_getattr(catobj, Anum_name, RelationGetDescr(catalog),
+ &isnull);

Missing blank line after the elog?

~~~

9. src/backend/commands/ddl_deparse.c - deparse_ColumnIdentity

+ /* Definition elemets */

typo "elemets"

~~~

10. src/backend/commands/ddl_deparse.c - deparse_CreateTrigStmt

10.1
+ else
+ elog(ERROR, "unrecognized trigger timing value %d", node->timing);

should that say "unrecognized trigger timing type" (e.g. type instead of value)

10.2
+ /*
+ * Decode the events that the trigger fires for. The output is a list;
+ * in most cases it will just be a string with the even name, but when
+ * there's an UPDATE with a list of columns, we return a JSON object.
+ */

"even name" -> "event name" ?

10.3
+ foreach(cell, node->columns)
+ {
+ char *colname = strVal(lfirst(cell));
+
+ cols = lappend(cols,
+ new_string_object(colname));
+ }

Unnecessary wrapping?

10.4
+ append_array_object(update, "%{columns:, }I", cols);
+
+ events = lappend(events,
+ new_object_object(update));

Unnecessary wrapping?

10.5
+ /* verify that the argument encoding is correct */

Uppercase comment

~~~

11. src/backend/commands/ddl_deparse.c - deparse_ColumnDef

+ saw_notnull = false;
+ foreach(cell, coldef->constraints)
+ {
+ Constraint *constr = (Constraint *) lfirst(cell);
+
+ if (constr->contype == CONSTR_NOTNULL)
+ saw_notnull = true;
+ }

Some similar functions here would 'break' when it finds the
saw_notnull. Why not break here?

~~~

12. src/backend/commands/ddl_deparse.c - obtainConstraints

12.1
+ /* only one may be valid */

Uppercase comment

12.2
+ /*
+ * scan pg_constraint to fetch all constraints linked to the given
+ * relation.
+ */

Uppercase comment

~~~

13. src/backend/commands/ddl_deparse.c - deparse_InhRelations

+/*
+ * Deparse the inherits relations.
+ *
+ * Given a table OID, return a schema qualified table list representing
+ * the parent tables.
+ */

I am not sure - should that say "Deparse the INHERITS relations." (uppercase)

~~~

14. src/backend/commands/ddl_deparse.c - pg_get_indexdef_detailed

14.1
+ * There is a huge lot of code that's a dupe of pg_get_indexdef_worker, but
+ * control flow is different enough that it doesn't seem worth keeping them
+ * together.
+ */

SUGGESTION
A large amount of code is duplicated from pg_get_indexdef_worker, ...

14.2
+ idxrelrec = (Form_pg_class) GETSTRUCT(ht_idxrel);
+
+
+ /*
+ * Fetch the pg_am tuple of the index' access method
+ */

Remove the extra blank line

~~~

15. src/backend/commands/ddl_deparse.c - deparse_IndexStmt

+ /*
+ * indexes for PRIMARY KEY and other constraints are output
+ * separately; return empty here.
+ */

Uppercase comment

~~~

16. src/backend/commands/ddl_deparse.c - deparse_FunctionSet

16.1
+static ObjTree *
+deparse_FunctionSet(VariableSetKind kind, char *name, char *value)

Missing function comment.

16.2
+ ObjTree *r;

Is 'r' the best name?

16.3
+ if (kind == VAR_RESET_ALL)
+ {
+ r = new_objtree("RESET ALL");
+ }
+ else if (value != NULL)
+ {
+ r = new_objtree_VA("SET %{set_name}I", 1,
+ "set_name", ObjTypeString, name);
+
+ /*
+ * Some GUC variable names are 'LIST' type and hence must not be
+ * quoted.
+ */
+ if (GetConfigOptionFlags(name, true) & GUC_LIST_QUOTE)
+ append_string_object(r, "TO %{set_value}s", value);
+ else
+ append_string_object(r, "TO %{set_value}L", value);
+ }
+ else
+ {
+ r = new_objtree("RESET");
+ append_string_object(r, "%{set_name}I", name);
+ }

It seems a bit strange that the kind of new_objtree is judged
sometimes by the *value. Why isn't this just always switching on the
different VariableSetKind?

~~~

17. src/backend/commands/ddl_deparse.c - deparse_CreateFunction

17.1
+/*
+ * deparse_CreateFunctionStmt
+ * Deparse a CreateFunctionStmt (CREATE FUNCTION)
+ *
+ * Given a function OID and the parsetree that created it, return the JSON
+ * blob representing the creation command.
+ */
+static ObjTree *
+deparse_CreateFunction(Oid objectId, Node *parsetree)

The name of the function and the name of the function in the comment
don't match. Suggest removing it from the comment.

17.2
+ /* get the pg_proc tuple */

Uppercase comment

17.3
+ /* get the corresponding pg_language tuple */

Uppercase comment

17.4
+ /* optional wholesale suppression of "name" occurs here */

Uppercase comment

17.5
+ append_string_object(createFunc, "%{volatility}s",
+ procForm->provolatile == PROVOLATILE_VOLATILE ?
+ "VOLATILE" :
+ procForm->provolatile == PROVOLATILE_STABLE ?
+ "STABLE" :
+ procForm->provolatile == PROVOLATILE_IMMUTABLE ?
+ "IMMUTABLE" : "INVALID VOLATILITY");

Does "INVALID VOLATILITY" make sense? Is that a real thing or should
this give ERROR?

17.6
+ foreach(cell, node->options)
+ {
+ DefElem *defel = (DefElem *) lfirst(cell);
+ ObjTree *tmp = NULL;

Why assign *tmp = NULL? I think it serves no purpose.

~~~

18. src/backend/commands/ddl_deparse.c - deparse_AlterFunction

18.1
+ * Deparse a AlterFunctionStmt (ALTER FUNCTION)

"a" -> "an"

18.2
+ /* get the pg_proc tuple */

Uppercase comment

18.3
+ alterFunc = new_objtree_VA("ALTER FUNCTION", 0);
+
+ params = NIL;

Why not just assign params = NIL at the declaration?

~~~

19. src/backend/commands/ddl_deparse.c - deparse_AlterOwnerStmt

+ * Deparse a AlterOwnerStmt (ALTER ... OWNER TO ...).

"a" -> "an"

~~~

20. src/backend/commands/ddl_deparse.c - deparse_AlterOperatorStmt

+ * Deparse a AlterOperatorStmt (ALTER OPERATOR ... SET ...).

"a" -> "an"

~~~

21. src/backend/commands/ddl_deparse.c - deparse_RenameStmt

21.1
+ * In a ALTER .. RENAME command, we don't have the original name of the

"a" -> "an"

21.2
+ relation_close(relation, AccessShareLock);
+
+ break;
+ case OBJECT_SCHEMA:

Misplaced bank line; should be before after break;

21.3
+ break;
+ case OBJECT_TRIGGER:

Put blank line after break;

21.4
+ break;
+ default:

Put blank line after break;

~~~

22. src/backend/commands/ddl_deparse.c - deparse_Seq_Cache

+static inline ObjElem *
+deparse_Seq_Cache(ObjTree *parent, Form_pg_sequence seqdata, bool alter_table)

Is param 'alter_table’ correct here or should that be 'alter_sequence'
(or just 'alter')?

~~

23. src/backend/commands/ddl_deparse.c - deparse_Seq_Cycle

+static inline ObjElem *
+deparse_Seq_Cycle(ObjTree *parent, Form_pg_sequence seqdata, bool alter_table)

Is param 'alter_table’ correct here or should that be 'alter_sequence'
(or just 'alter')?

~~~

24. src/backend/commands/ddl_deparse.c - deparse_Seq_IncrementBy

+static inline ObjElem *
+deparse_Seq_IncrementBy(ObjTree *parent, Form_pg_sequence seqdata,
bool alter_table)

Is param 'alter_table’ correct here or should that be 'alter_sequence'
(or just 'alter')?

~~~

25. src/backend/commands/ddl_deparse.c - deparse_Seq_Minvalue

+static inline ObjElem *
+deparse_Seq_Minvalue(ObjTree *parent, Form_pg_sequence seqdata, bool
alter_table)

Is param 'alter_table’ correct here or should that be 'alter_sequence'
(or just 'alter')?

~~~

26. src/backend/commands/ddl_deparse.c - deparse_Seq_Maxvalue

+static inline ObjElem *
+deparse_Seq_Maxvalue(ObjTree *parent, Form_pg_sequence seqdata, bool
alter_table)

Is param 'alter_table’ correct here or should that be 'alter_sequence'
(or just 'alter')?

~~~

27. src/backend/commands/ddl_deparse.c - deparse_Seq_Startwith

+static inline ObjElem *
+deparse_Seq_Startwith(ObjTree *parent, Form_pg_sequence seqdata, bool
alter_table)

Is param 'alter_table’ correct here or should that be 'alter_sequence'
(or just 'alter')?

~~~

28. src/backend/commands/ddl_deparse.c - deparse_CreateSchemaStmt

+/*
+ * Deparse a CreateSchemaStmt.
+ *
+ * Given a schema OID and the parsetree that created it, return an ObjTree
+ * representing the creation command.
+ *
+ * Note we don't output the schema elements given in the creation command.
+ * They must be output separately. (In the current implementation,
+ * CreateSchemaCommand passes them back to ProcessUtility, which will lead to
+ * this file if appropriate.)
+ */

"this file" ??

~~~

29. src/backend/commands/ddl_deparse.c - deparse_Seq_OwnedBy

29.1
+static ObjElem *
+deparse_Seq_OwnedBy(ObjTree *parent, Oid sequenceId, bool alter_table)

Missing function comment.

29.2
+ /* only consider AUTO dependencies on pg_class */

Uppercase comment.

29.3
+ if (!ownedby)
+ /* XXX this shouldn't happen ... */
+ ownedby = new_objtree_VA("OWNED BY %{owner}D",
+ 3,
+ "clause", ObjTypeString, "owned",
+ "owner", ObjTypeNull,
+ "present", ObjTypeBool, false);
+ return new_object_object(ownedby);

Put blank line before return;

~~~

30. src/backend/commands/ddl_deparse.c - deparse_CreateSeqStmt

30.1
+ /* definition elemets */

Uppercase comment, and typo "elemets"

30.2
+ /* we purposefully do not emit OWNED BY here */

Uppercase comment

~~~

31. src/backend/commands/ddl_deparse.c - deparse_AlterTableStmt

31.1
+ tmp = new_objtree_VA("ADD CONSTRAINT %{name}I",
+ 2,
+ "type", ObjTypeString, "add constraint using index",
+ "name", ObjTypeString, get_constraint_name(constrOid));

I think it was customary for you to put the number of varags on the
1st line, not like this. There are several others like this in this
function which should also be changed (where they fit OK on the first
line).

31.2
+ append_array_object(alterTableStmt, "%{subcmds:, }s", subcmds);
+ return alterTableStmt;

Maybe add blank line before the return;

~~~

32. src/backend/commands/ddl_deparse.c - deparse_AlterOpFamily

32.1
+ list = NIL;
+ foreach(cell, cmd->d.opfam.operators)

Why not assign list = NIL with the declaration?

32.2
+ proargtypes = procForm->proargtypes.values;
+ arglist = NIL;
+ for (i = 0; i < procForm->pronargs; i++)

Why not assign arglist = NIL with the declaration?

32.3
+ if (!stmt->isDrop)
+ append_format_string(alterOpFam, "ADD");
+ else
+ append_format_string(alterOpFam, "DROP");

Maybe simpler to reverse these; IMO isDrop means "DROP" makes more sense.

~~~

33. src/backend/commands/ddl_deparse.c - deparse_DefineStmt_Operator

+ /* Add the definition clause */
+ list = NIL;

Why not assign list = NIL with the declaration?

~~~

34. src/backend/commands/ddl_deparse.c - deparse_DefineStmt

+ default:
+ elog(ERROR, "unsupported object kind");
+ return NULL;

What purpose does this return serve after the ERROR? If you have to do
something to quieten the compiler, maybe it's better to set defStmt =
NULL at declaration.

======

35. src/backend/commands/ddl_json.c - <General>

In the errmsg text some of the messages say JSON (uppercase) and some
say json (lowercase). IMO they should all be consistent. Maybe say
JSON, since that way seems dominant.

~~~

36. src/backend/commands/ddl_json.c - enum

+typedef enum
+{
+ tv_absent,
+ tv_true,
+ tv_false
+} trivalue;

It seems there is another enum elsewhere already called this, because
you did not add it into the typedefs.list, yet it is already there. Is
that OK? Maybe this should have a unique name for this module.

~~~

37. src/backend/commands/ddl_json.c - expand_fmt_recursive

37.1
+ is_array = false;
+
+ ADVANCE_PARSE_POINTER(cp, end_ptr);

Why not assign is_array = false in the declaration?

37.2
+ /*
+ * Validate that we got an array if the format string specified one.
+ * And finally print out the data
+ */
+ if (is_array)
+ expand_jsonb_array(buf, param, value, arraysep, specifier, start_ptr);
+ else
+ expand_one_jsonb_element(buf, param, value, specifier, start_ptr);

"Print out" the data? This comment seems a bit over-complicated.
Perhaps these sentences can be combined and re-worded a bit.

SUGGESTION (maybe?)
Expand the data (possibly an array) into the output StringInfo.

~~~

38. src/backend/commands/ddl_json.c - expand_jsonval_identifier

+/*
+ * Expand a json value as an identifier. The value must be of type string.
+ */
+static void
+expand_jsonval_identifier(StringInfo buf, JsonbValue *jsonval)

Should that say "as a quoted identifier" ?

~~~

39. src/backend/commands/ddl_json.c - expand_jsonval_typename

39.1
+ switch (is_array)
+ {
+ default:
+ case tv_absent:

It seems slightly unusual for the default case to not be the last
switch case. Consider rearranging it.

39.2
+ if (schema == NULL)
+ appendStringInfo(buf, "%s%s%s",
+ quote_identifier(typename),
+ typmodstr ? typmodstr : "",
+ array_decor);
+
+ /* Special typmod needs */
+ else if (schema[0] == '\0')
+ appendStringInfo(buf, "%s%s%s",
+ typename,
+ typmodstr ? typmodstr : "",
+ array_decor);
+ else
+ appendStringInfo(buf, "%s.%s%s%s",
+ quote_identifier(schema),
+ quote_identifier(typename),
+ typmodstr ? typmodstr : "",
+ array_decor);

The last 2 parts:
typmodstr ? typmodstr : "",
array_decor);

are common for all those above appendStringInfo, so you could reduce
the code (if you want to) and just add the common parts at the end.

e.g.

if (schema == NULL)
appendStringInfo(buf, "%s", quote_identifier(typename));
else if (schema[0] == '\0')
appendStringInfo(buf, "%s", typename); /* Special typmod needs */
else
appendStringInfo(buf, "%s.%s", quote_identifier(schema),
quote_identifier(typename));

appendStringInfo(buf, "%s%s", typmodstr ? typmodstr : "", array_decor);

39.3
In other code (e.g. expand_jsonval_dottedname) you did lots of
pfree(str) after using the strings, so why not similar here?

~~~

40. src/backend/commands/ddl_json.c - expand_jsonval_operator

40.1
+ /* schema might be NULL or empty */

Uppercase comment

40.2
Why no pfree(str) here similar to what there was in prior code (e.g.
expand_jsonval_dottedname)?

~~~

41. src/backend/commands/ddl_json.c - expand_jsonval_string

Comment says "The value must be of type string or of type object."

Yeah, but what it is isn't? This code will just fall thru and return
true. Is that the right behaviour? Should there be an Assert at least?

~~~

42. src/backend/commands/ddl_json.c - expand_jsonval_number

Does this need some pfree after the string is copied to 'buf'?

~~~

43 src/backend/commands/ddl_json.c - expand_jsonval_role

+ rolename = find_string_in_jsonbcontainer(jsonval->val.binary.data,
+ "rolename", false, NULL);
+ appendStringInfoString(buf, quote_identifier(rolename));

Does this need some pfree after the string is copied to 'buf'?

~~~

44. src/backend/commands/ddl_json.c - ddl_deparse_json_to_string

+ d = DirectFunctionCall1(jsonb_in,
+ PointerGetDatum(json_str));

Seems unnecessary wrapping here.

~~~

45. src/backend/commands/ddl_json.c - fmtstr_error_callback

45.1
+/*
+ * Error context callback for JSON format string expansion.
+ *
+ * Possible improvement: indicate which element we're expanding, if applicable.
+ */

Should that "Possible improvement" comment have "XXX" prefix like most
other possible improvement comments have?

45.2
+fmtstr_error_callback(void *arg)
+{
+ errcontext("while expanding format string \"%s\"", (char *) arg);
+
+}

Remove the blank line.

======

46. src/backend/utils/adt/ruleutils.c - pg_get_trigger_whenclause

+char *
+pg_get_trigger_whenclause(Form_pg_trigger trigrec, Node *whenClause,
bool pretty)

Missing function comment

~~~

47. src/backend/utils/adt/ruleutils.c - print_function_sqlbody

@@ -3513,7 +3526,7 @@ pg_get_function_arg_default(PG_FUNCTION_ARGS)
PG_RETURN_TEXT_P(string_to_text(str));
}

-static void
+void
print_function_sqlbody(StringInfo buf, HeapTuple proctup)
{

Having a function comment is more important now that this is no longer static.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Danny Shemesh 2022-08-05 09:50:53 Expr. extended stats are skipped with equality operator
Previous Message Julien Rouhaud 2022-08-05 02:37:39 Re: sequence id overflow ERROR using timescaledb

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro Horiguchi 2022-08-05 06:50:32 Re: collate not support Unicode Variation Selector
Previous Message Peter Geoghegan 2022-08-05 05:40:44 Re: BTMaxItemSize seems to be subtly incorrect