Need Detailed to build real time CDC Data Pipeline

From: Puja Anil AJMERA <puja(dot)anil(dot)ajmera(at)hsbc(dot)co(dot)in>
To: "pgsql-general(at)lists(dot)postgresql(dot)org" <pgsql-general(at)lists(dot)postgresql(dot)org>
Cc: Vivekananda SOMPALLE <vivekananda(dot)sompalle(at)hsbc(dot)co(dot)in>
Subject: Need Detailed to build real time CDC Data Pipeline
Date: 2023-02-20 04:12:17
Message-ID: DB8P191MB073086C14AA343760E726FBAFFA49@DB8P191MB0730.EURP191.PROD.OUTLOOK.COM
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi Team,

We have use case to build a real time data pipeline from PostgresSQL to sync data with other DB. We have created logical slots to read WAL logs.
We are exploting some markets tools as well like ‘Debezium’, still want to have one option without relying on any other open source tool.

We came across 2 functions :

1. ‘pg_logical_slot_get_changes’ function will give us all the slot changes from previous read point, but once we read it the changed will be deleted from the slot.
2. ‘pg_logical_slot_peek_changes’ will give us the changes from previous read point, but it will not delete it.

Questions :
#1. How can we guarantee that once changes read and we update them into desired database, we would still have a place to have the WAL logs to read again if anything goes wrong while processing ?

#2. Also ther are parameters in ‘pg_logical_slot_get_changes’ , unable to find the examples to use the ‘upto_lsn,and ‘upto_nchanges’ and ‘VARIADIC options text[]’. Some examples and usecases to support it will be helpful for us.

* Reference : https://pgpedia.info/p/pg_logical_slot_get_changes.html

- pg_logical_slot_get_changes ( slot_name name, upto_lsn pg_lsn, upto_nchanges integer, VARIADIC options text[] ) →
setof record ( lsn pg_lsn, xid xid, data text )

Appreciate your early response.

Regards,
Puja Ajmera

INTERNAL

*******************************************************************
This e-mail is confidential. It may also be legally privileged.
If you are not the addressee you may not copy, forward, disclose
or use any part of it. If you have received this message in error,
please delete it and all copies from your system and notify the
sender immediately by return e-mail.
Internet communications cannot be guaranteed to be timely,
secure, error or virus-free. The sender does not accept liability
for any errors or omissions.
*******************************************************************
"SAVE PAPER - THINK BEFORE YOU PRINT!"

Browse pgsql-general by date

  From Date Subject
Next Message Amit Kapila 2023-02-20 04:14:48 Re: Support logical replication of DDLs
Previous Message Tom Lane 2023-02-20 00:17:45 Re: A performance issue in ROW_NUMBER() OVER(ORDER BY NULL) [27 times slow than OVER()] V14.5