Re: Allow logical replication to copy tables in binary format

From: Peter Smith <smithpb2250(at)gmail(dot)com>
To: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>
Subject: Re: Allow logical replication to copy tables in binary format
Date: 2023-03-16 00:03:14
Message-ID: CAHut+Pu2jUgR+EQff-EVH4v2_r2NgCox-=LL+nkwrQn7h3iyAg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here are some review comments for v15-0001

======
doc/src/sgml/logical-replication.sgml

1.
+ target table. However, logical replication in binary format is
more restrictive,
+ see <literal>binary</literal> option of
+ <link linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link>
+ for more details.

IMO (and Chat-GPT agrees) the new text should be 2 sentences.

Also, I changed "more details" --> "details" because none are provided here,.

SUGGESTION
However, logical replication in <literal>binary</literal> format is
more restrictive. See the binary option of <link
linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link> for details.

======
doc/src/sgml/ref/alter_subscription.sgml

2.
+ <para>
+ See <literal>binary</literal> option of <xref
linkend="sql-createsubscription"/>
+ for details of copying pre-existing data in binary format.
+ </para>

Should the link should be defined more like you did above using the
<command> markup to get the better font?

SUGGESTION (also minor rewording)
See the <literal>binary</literal> option of <link
linkend="sql-createsubscription"><command>CREATE
SUBSCRIPTION</command></link> for details about copying pre-existing
data in binary format.

======
doc/src/sgml/ref/create_subscription.sgml

3.
<para>
- Specifies whether the subscription will request the publisher to
- send the data in binary format (as opposed to text).
- The default is <literal>false</literal>.
- Even when this option is enabled, only data types having
- binary send and receive functions will be transferred in binary.
+ Specifies whether the subscription will request the publisher to send
+ the data in binary format (as opposed to text). The default is
+ <literal>false</literal>. Any initial table synchronization copy
+ (see <literal>copy_data</literal>) also uses the same format. Binary
+ format can be faster than the text format, but it is less portable
+ across machine architectures and PostgreSQL versions. Binary format
+ is very data type specific; for example, it will not allow copying
+ from a smallint column to an integer column, even though that would
+ work fine in text format. Even when this option is enabled, only data
+ types having binary send and receive functions will be transferred in
+ binary. Note that the initial synchronization requires all data types
+ to have binary send and receive functions, otherwise the
synchronization
+ will fail (see <xref linkend="sql-createtype"/> for more about
+ send/receive functions).
</para>

IMO that part saying "from a smallint column to an integer column"
should have <type></type> markups for "smallint" and "integer".

======
src/backend/replication/logical/tablesync.c

4.
+ /*
+ * The binary option for replication is supported since v14
+ */
+ if (walrcv_server_version(LogRepWorkerWalRcvConn) >= 140000 &&
+ MySubscription->binary)
+ {
+ appendStringInfo(&cmd, " WITH (FORMAT binary)");
+ options = lappend(options, makeDefElem("format", (Node *)
makeString("binary"), -1));
+ }

Should this now be a single-line comment instead of spanning 3 lines?

======
src/test/subscription/t/014_binary.pl

5.
Everything looked OK to me, but the TAP file has only small comments
for each test step, which forces you to read everything from
top-to-bottom to understand what is going on. I felt it might be
easier to understand the tests if you add a few "bigger" comments just
to break the tests into the categories being tested. For example,
something like:

# ------------------------------------------------------
# Ensure binary mode also executes COPY in binary format
# ------------------------------------------------------

~

# --------------------------------------
# Ensure normal binary replication works
# --------------------------------------

~

# ------------------------------------------------------------------------------
# Use ALTER SUBSCRIPTION to change to text format and then back to binary format
# ------------------------------------------------------------------------------

~

# ---------------------------------------------------------------
# Test binary replication without and with send/receive functions
# ---------------------------------------------------------------

~

# ----------------------------------------------
# Test different column orders on pub/sub tables
# ----------------------------------------------

~

# -----------------------------------------------------
# Test mismatched column types with/without binary mode
# -----------------------------------------------------

------
Kind Regards,
Peter Smith.
Fujitsu Australia

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2023-03-16 00:07:33 Re: Progress report of CREATE INDEX for nested partitioned tables
Previous Message Alvaro Herrera 2023-03-15 22:44:40 Re: cataloguing NOT NULL constraints