Re: Conflict detection and logging in logical replication

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: "Zhijie Hou (Fujitsu)" <houzj(dot)fnst(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Nisha Moond <nisha(dot)moond412(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Jan Wieck <jan(at)wi3ck(dot)info>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Subject: Re: Conflict detection and logging in logical replication
Date: 2024-08-21 01:32:30
Message-ID: 06bd494f-fc2f-4ca0-8df8-e3623fd0ba16@postgresql.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 8/6/24 4:15 AM, Zhijie Hou (Fujitsu) wrote:

> Thanks for the idea! I thought about few styles based on the suggested format,
> what do you think about the following ?

Thanks for proposing formats. Before commenting on the specifics, I do
want to ensure that we're thinking about the following for the log formats:

1. For the PostgreSQL logs, we'll want to ensure we do it in a way
that's as convenient as possible for people to parse the context from
scripts.

2. Semi-related, I still think the simplest way to surface this info to
a user is through a "pg_stat_..." view or similar catalog mechanism (I'm
less opinionated on the how outside of we should make it available via SQL).

3. We should ensure we're able to convey to the user these details about
the conflict:

* What time it occurred on the local server (which we'd have in the logs)
* What kind of conflict it is
* What table the conflict occurred on
* What action caused the conflict
* How the conflict was resolved (ability to include source/origin info)

With that said:

> ---
> Version 1
> ---
> LOG: CONFLICT: insert_exists; DESCRIPTION: remote INSERT violates unique constraint "uniqueindex" on relation "public.test".
> DETAIL: Existing local tuple (a, b, c) = (2, 3, 4) xid=123,origin="pub",timestamp=xxx; remote tuple (a, b, c) = (2, 4, 5).
>
> LOG: CONFLICT: update_differ; DESCRIPTION: updating a row with key (a, b) = (2, 4) on relation "public.test" was modified by a different source.
> DETAIL: Existing local tuple (a, b, c) = (2, 3, 4) xid=123,origin="pub",timestamp=xxx; remote tuple (a, b, c) = (2, 4, 5).
>
> LOG: CONFLICT: update_missing; DESCRIPTION: did not find the row with key (a, b) = (2, 4) on "public.test" to update.
> DETAIL: remote tuple (a, b, c) = (2, 4, 5).

I agree with Amit's downthread comment, I think this tries to put much
too much info on the LOG line, and it could be challenging to parse.

> ---
> Version 2
> It moves most the details to the DETAIL line compared to version 1.
> ---
> LOG: CONFLICT: insert_exists on relation "public.test".
> DETAIL: Key (a)=(1) already exists in unique index "uniqueindex", which was modified by origin "pub" in transaction 123 at 2024xxx;
> Existing local tuple (a, b, c) = (1, 3, 4), remote tuple (a, b, c) = (1, 4, 5).
>
> LOG: CONFLICT: update_differ on relation "public.test".
> DETAIL: Updating a row with key (a, b) = (2, 4) that was modified by a different origin "pub" in transaction 123 at 2024xxx;
> Existing local tuple (a, b, c) = (2, 3, 4); remote tuple (a, b, c) = (2, 4, 5).
>
> LOG: CONFLICT: update_missing on relation "public.test".
> DETAIL: Did not find the row with key (a, b) = (2, 4) to update;
> Remote tuple (a, b, c) = (2, 4, 5).

I like the brevity of the LOG line, while it still provides a lot of
info. I think we should choose the words/punctuation in the DETAIL
carefully so it's easy for scripts to ultimately parse (even if we
expose info in pg_stat, people may need to refer to older log files to
understand how a conflict resolved).

> ---
> Version 3
> It is similar to the style in the current patch, I only added the key value for
> differ and missing conflicts without outputting the complete
> remote/local tuple value.
> ---
> LOG: conflict insert_exists detected on relation "public.test".
> DETAIL: Key (a)=(1) already exists in unique index "uniqueindex", which was modified by origin "pub" in transaction 123 at 2024xxx.
>
> LOG: conflict update_differ detected on relation "public.test".
> DETAIL: Updating a row with key (a, b) = (2, 4), which was modified by a different origin "pub" in transaction 123 at 2024xxx.
>
> LOG: conflict update_missing detected on relation "public.test"
> DETAIL: Did not find the row with key (a, b) = (2, 4) to update.

I think outputting the remote/local tuple value may be a parameter we
need to think about (with the desired outcome of trying to avoid another
parameter). I have a concern about unintentionally leaking data (and I
understand that someone with access to the logs does have a broad
ability to view data); I'm less concerned about the size of the logs, as
conflicts in a well-designed system should be rare (though a conflict
storm could fill up the logs, likely there are other issues to content
with at that point).

Thanks,

Jonathan

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message bucoo 2024-08-21 01:49:37 optimize hashjoin
Previous Message px shi 2024-08-21 01:11:03 Re: [Bug Fix]standby may crash when switching-over in certain special cases