Read Replica Inconsistencies

From: Giuliano Sofi <giuliano(dot)sofi(at)cloudacademy(dot)com>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: Developers Group <developers(at)cloudacademy(dot)com>
Subject: Read Replica Inconsistencies
Date: 2022-09-26 09:06:49
Message-ID: CAHjPeqvLb9ewP0ubvZdCaafZWriJRuE3Qr7hUkNn5qZQfBxD_A@mail.gmail.com
Lists: pgsql-bugs

Hello folks,

Over the last few weeks we have been seeing some strange behavior on our AWS
RDS read replica. Below are the details of our setup, the problem we are
experiencing, and the investigation we have done so far.

Our setup

Service: Amazon RDS Postgres Replica

Postgres Version: 13.7

Replication Model: Single-Master

Class: Unidirectional

Mode: Asynchronous Streaming

Type: Physical Replication

Resources: Both the primary and the read replica are managed by Amazon RDS.
The primary instance is a db.m6g.2xlarge, while the replica is a db.m6g.large.
Both have 640 GB gp2 disks and the Multi-AZ option enabled.

The problem

We are experiencing primary key constraint violations on the replica. In
detail:

A couple of times per day, some tables that have a primary key defined on
the master contain duplicated records on the read replica (i.e. the primary
key constraint is violated), and we do not get any error or warning from the
replica instance. These duplicates are detected by our tests and monitoring
tools, and the replica usually gets rid of them within a couple of minutes,
but in some rare cases it takes up to 4 hours to get back to a consistent
state.
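To illustrate what we mean by duplicated records, here is a minimal sketch
of the kind of check that surfaces them (table and column names are
placeholders, not our real schema):

    -- Illustrative duplicate-detection query, run against the replica;
    -- "orders" and "id" are placeholder names.
    SELECT id, count(*) AS copies
    FROM orders
    GROUP BY id
    HAVING count(*) > 1;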

Configuration

We compared the configuration of the two instances without finding any
significant differences; in any case, we are attaching both configurations
to this message so you can check them.

We would like to understand if there is something we can tune to prevent
such duplications in our production replica.

Here you can find the configuration files: attachments
<https://drive.google.com/drive/folders/1hEjQVJGygBntxYsDCKYTlayE8ghmhrOn?usp=sharing>

Investigation

Transaction Log Disk Usage

We use our replica instance as the source for our internal data pipelines.
These pipelines run on an hourly schedule. From the monitoring dashboards we
see that each time the ETLs start reading there is an increase in the
replica LAG metric, which is expected. What is less expected is an
occasional spike in the Transaction Log Disk Usage metric, with peaks of up
to 3 GB, which seems unrelated to the LAG metric.

Do you think these spikes could be related to our issue?
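For completeness, in addition to the RDS metrics, replay lag can be checked
directly on the replica; the following is just an illustrative query, not
our monitoring code:

    -- Run on the replica; pg_last_xact_replay_timestamp() returns NULL on a
    -- primary that has never been in recovery.
    SELECT pg_is_in_recovery() AS in_recovery,
           now() - pg_last_xact_replay_timestamp() AS approx_replay_lag;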

13.8 version TransactionIdIsInProgress bugfix

We've also noticed that a bug related to replication was addressed in the
13.8 release. The release notes (https://www.postgresql.org/docs/release/13.8/)
state:

- Fix race condition when checking transaction visibility (Simon Riggs)

  TransactionIdIsInProgress could report false before the subject
  transaction is considered visible, leading to various misbehaviors. The
  race condition window is normally very narrow, but use of synchronous
  replication makes it much wider, because the wait for a synchronous
  replica happens in that window.

Do you think we could be affected by this issue? Since Amazon RDS doesn't
support that version yet, we can't upgrade to check whether it solves the
problem. Has this fix also been back-patched into the 14.x releases of
Postgres? If so, and you agree that this could be our issue, we could
evaluate an upgrade to address the problem as soon as possible.

Master serial primary keys are varchar in the replica

On the replica instance, id columns of SERIAL type appear as VARCHAR. Given
a table on the master whose primary key is a SERIAL column, the DDL of the
replicated table is different: the primary key column is a VARCHAR instead
of a SERIAL/INT. We do not apply any strategy or configuration for dealing
with ids, so this conversion happens automatically during the replication
build/streaming.

Screenshots showing the difference in primary key type between the master
table and its replicated version are included in the attachments linked
above.

If the serial4 type cannot be kept during replication, we would at least
expect it to fall back to a numeric type and remain subject to the defined
constraints.
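For what it's worth, the difference can also be seen by comparing the column
definitions on the two instances with a catalog query along these lines
(again, the table and column names are placeholders):

    -- Run on both the primary and the replica and compare the output.
    SELECT column_name, data_type, column_default
    FROM information_schema.columns
    WHERE table_name = 'orders'
      AND column_name = 'id';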

We also considered possible unique-constraint problems caused by collation
issues, so we ran the bt_index_check() function from the amcheck extension
on the primary key indexes of the replica, without finding any issue.
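Roughly, the check we ran looks like the following (the index name is a
placeholder):

    -- Requires the amcheck extension (created on the primary so that it is
    -- also available on the replica); "orders_pkey" is a placeholder name.
    SELECT bt_index_check('orders_pkey'::regclass);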

Could you please help us figure out how to solve this issue?

Thanks a lot for your help

The CloudAcademy team
