Re: pglogical_output - a general purpose logical decoding output plugin

From: Tomasz Rybak <tomasz(dot)rybak(at)post(dot)pl>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pglogical_output - a general purpose logical decoding output plugin
Date: 2016-01-24 15:13:52
Message-ID: 20160124151352.1350.34827.pgcf@coridan.postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Final part of review:
protocol.txt

+|origin_identifier|signed char[origin_identifier_length]|An origin identifier of arbitrary, upstream-application-defined structure. _Should_ be text in the same encoding as the upstream database. NULL-terminated. _Should_ be 7-bit ASCII.

Does it need NULL-termination when previous field contains length of origin_identifier?
Similarly for relation metadata message.

+ metadata message. All consecutive row messages must currently have the same
+ relidentifier. (_Later extensions to add metadata caching will relax these
+ requirements for clients that advertise caching support; see the documentation
+ on metadata messages for more detail_).

Shouldn't this be changed as metadata cache is implemented?

+ |relidentifier|uint32|relidentifier that matches the table metadata message sent for this row.
+ (_Not present in BDR, which sends nspname and relname instead_)

and

+ |natts|uint16|Number of fields sent in this tuple part.
+ (_Present in BDR, but meaning significantly different here)_

Is BDR mention relevant here? It was not mentioned anywhere else, and now appears
ex machina.

Long quote - but required.

+ ==== Tuple fields
+
+ |===
+ |Tuple type|signed char|Identifies the kind of tuple being sent.
+
+ |tupleformat|signed char|‘**T**’ (0x54)
+ |natts|uint16|Number of fields sent in this tuple part.
+ (_Present in BDR, but meaning significantly different here)_
+ |[tuple field values]|[composite]|
+ |===
+
+ ===== Tuple tupleformat compatibility
+
+ Unrecognised _tupleformat_ kinds are a protocol error for the downstream.
+
+ ==== Tuple field value fields
+
+ These message parts describe individual fields within a tuple.
+
+ There are two kinds of tuple value fields, abbreviated and full. Which is being
+ read is determined based on the first field, _kind_.
+
+ Abbreviated tuple value fields are nothing but the message kind:
+
+ |===
+ |*Message*|*Type/Size*|*Notes*
+
+ |kind|signed char| * ‘**n**’ull (0x6e) field
+ |===
+
+ Full tuple value fields have a length and datum:
+
+ |===
+ |*Message*|*Type/Size*|*Notes*
+
+ |kind|signed char| * ‘**i**’nternal binary (0x62) field
+ |length|int4|Only defined for kind = i\|b\|t
+ |data|[length]|Data in a format defined by the table metadata and column _kind_.
+ |===
+
+ ===== Tuple field values kind compatibility
+
+ Unrecognised field _kind_ values are a protocol error for the downstream. The
+ downstream may not continue processing the protocol stream after this
+ point**.**
+
+ The upstream may not send ‘**i**’nternal or ‘**b**’inary format values to the
+ downstream without the downstream negotiating acceptance of such values. The
+ downstream will also generally negotiate to receive type information to use to
+ decode the values. See the section on startup parameters and the startup
+ message for details.

I do not fully get it.
For each tuple we are supposed to have "Tuple type" (which is kind?). Does it
mean that T1 might be sent using "i" kind and T2 sent using "b" kind?
At the same tme we have kind "n" (null) - but it belongs to field level
(one field might be null, not entire tuple).

In other words - do we have "i" and then "T" and then number of attributes,
or "T', then number of attributes, then "i" or "b" or "n" for each of attributes?

Also - description of "b" seems missing.

+ Before sending changed rows for a relation, a metadata message for the relation
+ must be sent so the downstream knows the namespace, table name, column names,
+ optional column types, etc. A relidentifier field, an arbitrary numeric value
+ unique for that relation on that upstream connection, maps the metadata to
+ following rows.
+
+ A client should not assume that relation metadata will be followed immediately
+ (or at all) by rows, since future changes may lead to metadata messages being
+ delivered at other times. Metadata messages may arrive during or between
+ transactions.
+
+ The upstream may not assume that the downstream retains more metadata than the
+ one most recent table metadata message. This applies across all tables, so a
+ client is permitted to discard metadata for table x when getting metadata for
+ table y. The upstream must send a new metadata message before sending rows for
+ a different table, even if that metadata was already sent in the same session
+ or even same transaction. _This requirement will later be weakened by the
+ addition of client metadata caching, which will be advertised to the upstream
+ with an output plugin parameter._

This needs reworking while metadata caching is supported

+ |Message type|signed char|‘**S**’ (0x53) - startup
+ |Startup message version|uint8|Value is always “1”.

Value is "1" for the current plugin version. It is represented in code as
PGLOGICAL_STARTUP_MESSAGE_VERSION_NUM.

+ |startup_params_format|int8|1|The format version of this startup parameter set. Always the digit 1 (0x31), null terminated.

int8 suggests binary value, and here we are sending ASCII which is NULL-terminated.
It's a bit inconsistent.

Please wrap long lines in the last part of protocol.txt file, starting with
"Arguments client supplies to output plugn" section.

Also - please put 3 paragraphs from your email from 2016-01-07 15:50
(staring with "but this isn't just about replication") into README.
This is really good rationale for this plugin existence.

Best regards.

The new status of this patch is: Waiting on Author

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2016-01-24 15:25:00 Re: Re: pglogical_output - a general purpose logical decoding output plugin
Previous Message Greg Stark 2016-01-24 14:59:12 Re: Releasing in September