RE: CDC/ETL system on top of logical replication with pgoutput, custom client

From: José Neves <rafaneves3(at)msn(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: CDC/ETL system on top of logical replication with pgoutput, custom client
Date: 2023-10-25 10:53:47
Message-ID: PR3P193MB0491A31D7E2C6262A0873CE689DEA@PR3P193MB0491.EURP193.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Ok, I see. In that situation is safe indeed, as the offset is lower than the current transaction commit.
But I think that I asked the wrong question. I guess that the right question is: Can we receive a keep-alive message with an LSN offset bigger than the commit of the open or following transactions?
Something like:

BEGIN LSN001
INSERT LSN002
KEEP LIVE LSN006
UPDATE LSN004
COMMIT LSN005

Or:

KEEP LIVE LSN006
BEGIN LSN001
INSERT LSN002
UPDATE LSN004
COMMIT LSN005
KEEP LIVE LSN007

Or is the sequence ensured not only between commits but also with keep-alive messaging?
________________________________
De: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
Enviado: 25 de outubro de 2023 11:42
Para: José Neves <rafaneves3(at)msn(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>; Andres Freund <andres(at)anarazel(dot)de>; pgsql-hackers(at)postgresql(dot)org <pgsql-hackers(at)postgresql(dot)org>
Assunto: Re: CDC/ETL system on top of logical replication with pgoutput, custom client

On Tue, Oct 24, 2023 at 8:53 PM José Neves <rafaneves3(at)msn(dot)com> wrote:
>
> Hi there, hope to find you well.
>
> I have a follow-up question to this already long thread.
>
> Upon deploying my PostgreSQL logical replication fed application on a stale database, I ended up running out of space, as the replication slot is being held back till the next time that we receive a data-changing event, and we advance to that new LSN offset.
> I think that the solution for this is to advance our LSN offset every time a keep-alive message is received ('k' // 107).
> My doubt is, can the keep-alive messages be received in between open transaction events? I think not, but I would like to get your input to be extra sure as if this happens, and I commit that offset, I may introduce again faulty logic leading to data loss.
>
> In sum, something like this wouldn't happen:
> BEGIN LSN001
> INSERT LSN002
> KEEP LIVE LSN003
> UPDATE LSN004
> COMMIT LSN005
>

If the downstream acknowledges receipt of LSN003 and saves it locally
and crashes, upon restart the upstream will resend all the
transactions that committed after LSN003 including the one ended at
LSN005. So this is safe.

--
Best Wishes,
Ashutosh

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message a.rybakina 2023-10-25 11:04:24 Re: POC, WIP: OR-clause support for indexes
Previous Message Ashutosh Bapat 2023-10-25 10:42:25 Re: CDC/ETL system on top of logical replication with pgoutput, custom client