Re: Allow logical replication to copy tables in binary format

From: Melih Mutlu <m(dot)melihmutlu(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>
Cc: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, "Takamichi Osumi (Fujitsu)" <osumi(dot)takamichi(at)fujitsu(dot)com>, Euler Taveira <euler(at)eulerto(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Allow logical replication to copy tables in binary format
Date: 2023-03-01 14:28:23
Message-ID: CAGPVpCQxgH1WQCxUE0kts+TajXdS3bp9ohhX=M8F_SZY4J7hNQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote on
Wed, 1 Mar 2023 at 15:02:

> On Wed, Mar 1, 2023 at 4:47 PM Dilip Kumar <dilipbalaut(at)gmail(dot)com> wrote:
> > I agree with this thought, basically adding an extra option will
> > always complicate things for the user. And logically it doesn't make
> > much sense to copy data in text mode and then stream in binary mode
> > (except in some exception cases and for that, we can always alter the
> > subscription). So IMHO it makes more sense that if the binary option
> > is selected then ideally it should choose to do the initial sync also
> > in the binary mode.
>

I agree that copying in text then streaming in binary does not have a good
use-case.

> I think I was suggesting earlier to use a separate option for binary
> table sync copy based on my initial knowledge of binary COPY. Now that
> I have a bit more understanding of binary COPY and subscription's
> existing binary option, +1 for using the same option for table sync
> too.
>
> If we use the existing subscription binary option for the table sync,
> there are the following possibilities for the users:
> 1. users might want to enable the binary option for table sync and
> disable it for subsequent replication
> 2. users might want to enable the binary option for both table sync
> and for subsequent replication
> 3. users might want to disable the binary option for table sync and
> enable it for subsequent replication
> 4. users might want to disable binary option for both table sync and
> for subsequent replication
>
> Binary copy use-cases are a bit narrower compared to the existing
> subscription binary option; it works only if:
> a) the column data types have appropriate binary send/receive functions
> b) not replicating between different major versions or different platforms
> c) both publisher and subscriber tables have the exact same column
> types (not when replicating from smallint to int or numeric to int8
> and so on)
> d) both publisher and subscriber support COPY with binary option
>
> Now if one enabled the binary option for table sync, that means, they
> must have ensured all (a), (b), (c), and (d) are met. The point is if
> one decides to use binary copy for table sync, it means that the
> subsequent binary replication works too without any problem. If
> required, one can disable it for normal replication i.e. post-table
> sync.
>

That was my intention at the beginning of this patch. At some point the new
option also seemed to make sense, so I added the copy_binary option in
response to reviews.
The earlier versions of the patch didn't have it. Without the new option,
this patch would also be smaller.

But before going back to tying all of this to the binary option, without a
new option, I think we should decide whether that's really the ideal way to
do it.
I believe the patch is in good shape now with the binary_copy option, which
is not tied to anything, along with explanations in the docs, separate
tests, etc.
But I also agree that binary=true should do everything in binary format and
binary=false should do everything in text format. It makes more sense.
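
To make the single-option behavior concrete, here is a rough sketch. The
names mysub and mypub and the connection string are placeholders, and the
comments assume the behavior proposed in this thread, where binary = true
also covers the initial table sync:

```sql
-- Hypothetical names: mysub, mypub and the connection string are placeholders.
-- Assumed behavior (as proposed in this thread, not necessarily what any
-- released version does): binary = true applies to the initial table sync
-- as well as to the subsequent replication.
CREATE SUBSCRIPTION mysub
    CONNECTION 'host=publisher.example.com dbname=postgres'
    PUBLICATION mypub
    WITH (binary = true);

-- After table sync has finished, subsequent replication can be switched
-- back to text format if binary streaming is not wanted.
ALTER SUBSCRIPTION mysub SET (binary = false);
```

The opposite case (text for table sync, binary afterwards) would presumably
be the same two steps with the values swapped: create the subscription with
binary = false and set it to true once the sync is done.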

Best,
--
Melih Mutlu
Microsoft
