Re: Conflict detection and logging in logical replication

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>
Cc: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: Conflict detection and logging in logical replication
Date: 2024-08-21 10:35:21
Message-ID: CAA4eK1LMvgageidw+w01e=2E+Ki-fWHmFuj-gfd0CtCjycvj+Q@mail.gmail.com
Lists: pgsql-hackers

On Wed, Aug 21, 2024 at 8:35 AM Zhijie Hou (Fujitsu)
<houzj(dot)fnst(at)fujitsu(dot)com> wrote:
>
> On Wednesday, August 21, 2024 9:33 AM Jonathan S. Katz <jkatz(at)postgresql(dot)org> wrote:
> > On 8/6/24 4:15 AM, Zhijie Hou (Fujitsu) wrote:
> >
> > > Thanks for the idea! I thought about a few styles based on the suggested
> > > format. What do you think about the following?
> >
> > Thanks for proposing formats. Before commenting on the specifics, I do want to
> > ensure that we're thinking about the following for the log formats:
> >
> > 1. For the PostgreSQL logs, we'll want to ensure we do it in a way that's as
> > convenient as possible for people to parse the context from scripts.
>
> Yeah. And I personally think the current log format is OK for parsing purposes.
>
> >
> > 2. Semi-related, I still think the simplest way to surface this info to a user is
> > through a "pg_stat_..." view or similar catalog mechanism (I'm less opinionated
> > on the how, beyond that we should make it available via SQL).
>
> We have a patch (v19-0002) in this thread to collect conflict stats and display
> them in a view, and the patch is under review.
>

IIUC, Jonathan is asking us to store the conflict information (the same
details we display in the logs). That is useful, and we can do it
separately.

> Storing it in a catalog needs more analysis as we may need to add additional
> logic to clean up old conflict data in that catalog table. I think we can
> consider it as a future improvement.
>

Agreed. The cleanup part needs more consideration.

> >
> > 3. We should ensure we're able to convey to the user these details about the
> > conflict:
> >
> > * What time it occurred on the local server (which we'd have in the logs)
> > * What kind of conflict it is
> > * What table the conflict occurred on
> > * What action caused the conflict
> > * How the conflict was resolved (ability to include source/origin info)
>
> I think all of the above are already covered in the current conflict log, except
> that we do not yet support resolving the conflict, so we don't log the resolution.
>
> >
> >
> > I think outputting the remote/local tuple value may be a parameter we need to
> > think about (with the desired outcome of trying to avoid another parameter). I
> > have a concern about unintentionally leaking data (and I understand that
> > someone with access to the logs does have a broad ability to view data); I'm
> > less concerned about the size of the logs, as conflicts in a well-designed
> > system should be rare (though a conflict storm could fill up the logs, likely there
> > are other issues to contend with at that point).
>
> We could use an option to control this, but the tuple value is already output in
> some existing cases (e.g. partition check, table constraints check, view with check
> constraints, unique violation), and it would test the current user's
> privileges to decide whether to output the tuple or not. So, I think it's OK
> to display the tuple for conflicts.
>

The current information is displayed so that users can use it to
manually resolve conflicts if required. If we think that displaying
tuple data leaks information (either from a security angle or
otherwise), then we can reconsider. However, as we already display
tuple information in other places, as pointed out by Hou-San, we
thought it is also okay to display it in this case.

--
With Regards,
Amit Kapila.
