RE: Conflict detection and logging in logical replication

From: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
To: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: RE: Conflict detection and logging in logical replication
Date: 2024-08-21 03:05:47
Message-ID: OS0PR01MB57163CAC0574ECFEB0971A31948E2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Wednesday, August 21, 2024 9:33 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
> On 8/6/24 4:15 AM, Zhijie Hou (Fujitsu) wrote:
>
> > Thanks for the idea! I thought about a few styles based on the suggested
> > format, what do you think about the following?
>
> Thanks for proposing formats. Before commenting on the specifics, I do want to
> ensure that we're thinking about the following for the log formats:
>
> 1. For the PostgreSQL logs, we'll want to ensure we do it in a way that's as
> convenient as possible for people to parse the context from scripts.

Yeah. And I personally think the current log format is OK for parsing purposes.

>
> 2. Semi-related, I still think the simplest way to surface this info to a user is
> through a "pg_stat_..." view or similar catalog mechanism (I'm less opinionated
> on the how outside of we should make it available via SQL).

We have a patch (v19-0002) in this thread that collects conflict statistics and
displays them in a view; the patch is currently under review.

Storing the conflict data in a catalog needs more analysis, as we may need to
add additional logic to clean up old conflict data in that catalog table. I
think we can consider it as a future improvement.

>
> 3. We should ensure we're able to convey to the user these details about the
> conflict:
>
> * What time it occurred on the local server (which we'd have in the logs)
> * What kind of conflict it is
> * What table the conflict occurred on
> * What action caused the conflict
> * How the conflict was resolved (ability to include source/origin info)

I think all of the above are already covered in the current conflict log,
except that we do not yet support resolving conflicts, so we don't log the
resolution.

>
>
> I think outputting the remote/local tuple value may be a parameter we need to
> think about (with the desired outcome of trying to avoid another parameter). I
> have a concern about unintentionally leaking data (and I understand that
> someone with access to the logs does have a broad ability to view data); I'm
> less concerned about the size of the logs, as conflicts in a well-designed
> system should be rare (though a conflict storm could fill up the logs, likely there
> are other issues to contend with at that point).

We could add an option to control this, but the tuple value is already output
in some existing cases (e.g. partition check, table constraints check, view
with check constraints, unique violation), and those cases test the current
user's privileges to decide whether to output the tuple or not. So, I think
it's OK to display the tuple for conflicts.

Best Regards,
Hou zj
