Re: Unexpected data when subscribing to logical replication slot

From: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
To: Daniel McKenzie <daniel(dot)mckenzie(at)curvedental(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: Unexpected data when subscribing to logical replication slot
Date: 2024-05-08 11:32:45
Message-ID: 448c5c84-52db-470f-a01c-57776a9914aa@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Hi,

On 5/8/24 11:17, Daniel McKenzie wrote:
> We have a .NET application which subscribes to a logical replication slot
> using wal2json. The purpose of the application is to publish events to AWS
> SQS. We are required to first "enrich" these events by querying the
> database.
>
> We have found that these queries will often find old data (i.e. the data
> that existed prior to the update) which is unexpected.
>
> For example, when I use a psql terminal to update a user's last name from
> "Jones" to "Smith" then I would expect the enrichment query to find "Smith"
> but it will sometimes still find "Jones". It finds the old data perhaps 1
> in 50 times.
>
> To reproduce this I use a psql terminal to execute an update statement
> which changes a user's last name with \watch 1.5 and monitor our
> application logs for cases where the wal2json output and the enrichment
> query output have different last names.
>

Where/how does the enrichment query run? How does the whole process look
like? I guess an application is receiving decoded changes as JSON, and
then querying the database?

> We have compared transaction ids by adding include-xids to pg_recvlogical
> and adding txid_current() to to the enrich query and the txid_current() is
> always the xid + 1.
>
> We have found two things that appear to resolve the problem -
>
> - Using a more powerful EC2 instance. We can reproduce the issue with a
> r7a.medium instance but not with a r7a.large EC2 instance.
> - Changing the Postgres synchronous_commit parameter from "on" to "off".
> We cannot reproduce the issue with synchronous_commit set to "off".
>
> We need help to understand this unexpected behaviour.
>

Would be good to have some sort of reproducer - ideally a script that
sets up an instance + replication, and demonstrates the issue. Or at
least a sufficiently detailed steps to reproduce it without having to
guess what exactly you did.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Marcin Barczyński 2024-05-08 13:41:55 How to change NUMERIC type under a domain without rewriting a table?
Previous Message Daniel McKenzie 2024-05-08 09:17:40 Unexpected data when subscribing to logical replication slot