Re: Handle infinite recursion in logical replication setup

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: vignesh C <vignesh21(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "shiy(dot)fnst(at)fujitsu(dot)com" <shiy(dot)fnst(at)fujitsu(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, "houzj(dot)fnst(at)fujitsu(dot)com" <houzj(dot)fnst(at)fujitsu(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "kuroda(dot)hayato(at)fujitsu(dot)com" <kuroda(dot)hayato(at)fujitsu(dot)com>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Handle infinite recursion in logical replication setup
Date: 2023-01-10 02:43:15
Message-ID: 957e1b95-7dcd-a880-3cf4-7a2f7b7cbe97@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 9/12/22 1:23 AM, vignesh C wrote:
> On Fri, 9 Sept 2022 at 11:12, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>>
>> On Thu, Sep 8, 2022 at 9:32 AM vignesh C <vignesh21(at)gmail(dot)com> wrote:
>>>
>>>
>>> The attached patch has the changes to handle the same.
>>>
>>
>> Pushed. I am not completely sure whether we want the remaining
>> documentation patch in this thread in its current form or by modifying
>> it. Johnathan has shown some interest in it. I feel you can start a
>> separate thread for it to see if there is any interest in the same and
>> close the CF entry for this work.
>
> Thanks for pushing the patch. I have closed this entry in commitfest.
> I will wait for some more time and see the response regarding the
> documentation patch and then start a new thread if required.

I've been testing this patch in advancing of working on the
documentation and came across a behavior I wanted to note. Specifically,
I am hitting a deadlock while trying to synchronous replicate between
the two instances at any `synchronous_commit` level above `local`.

Here is my set up. I have two instances, "A" and "B".

On A and B, run:

CREATE TABLE sync (id int PRIMARY KEY, info float);
CREATE PUBLICATION sync FOR TABLE sync;

On A, run:

CREATE SUBSCRIPTION sync
CONNECTION 'connstr-to-B'
PUBLICATION sync
WITH (
streaming=true, copy_data=false,
origin=none, synchronous_commit='on');

On B, run:

CREATE SUBSCRIPTION sync
CONNECTION 'connstr-to-A'
PUBLICATION sync
WITH (
streaming=true, copy_data=false,
origin=none, synchronous_commit='on');

On A and B, run:

ALTER SYSTEM SET synchronous_standby_names TO 'sync';
SELECT pg_reload_conf();

Verify on A and B that pg_stat_replication.sync_state is set to "sync"

SELECT application_name, sync_state = 'sync' AS is_sync
FROM pg_stat_replication
WHERE application_name = 'sync';

The next to commands should be run simultaneously on A and B:

-- run this on A
INSERT INTO sync
SELECT x, random() FROM generate_series(1,2000000, 2) x;

-- run this on B
INSERT INTO sync
SELECT x, random() FROM generate_series(2,2000000, 2) x;

This consistently created the deadlock in my testing.

Discussing with Masahiko off-list, this is due to a deadlock from 4
processes: the walsenders on A and B, and the apply workers on A and B.
The walsenders are waiting for feedback from the apply workers, and the
apply workers are waiting for the walsenders to synchronize (I may be
oversimplifying).

He suggested I try the above example instead with `synchronous_commit`
set to `local`. In this case, I verified that there is no more deadlock,
but he informed me that we would not be able to use cascading
synchronous replication when "origin=none".

If we decide that this is a documentation issue, I'd suggest we improve
the guidance around using `synchronous_commit`[1] on the CREATE
SUBSCRIPTION page, as the GUC page[2] warns against using `local`:

"The setting local causes commits to wait for local flush to disk, but
not for replication. This is usually not desirable when synchronous
replication is in use, but is provided for completeness."

Thanks,

Jonathan

[1] https://www.postgresql.org/docs/devel/sql-createsubscription.html
[2]
https://www.postgresql.org/docs/devel/runtime-config-wal.html#GUC-SYNCHRONOUS-COMMIT

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Regina Obe 2023-01-10 03:03:11 RE: [PATCH] Support % wildcard in extension upgrade filenames
Previous Message Tom Lane 2023-01-10 02:38:03 Re: ATTACH PARTITION seems to ignore column generation status