Quick Links

Re: Multi master disjoint cluster

From:	Vijaykumar Jain <vijaykumarjain(dot)github(at)gmail(dot)com>
To:	pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject:	Re: Multi master disjoint cluster
Date:	2022-10-26 17:04:13
Message-ID:	CAM+6J95ZzSrdqLTi6fRquOP8gkn_R4YaqZPCoFp+itDhoyV77w@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-general

Actually, pls ignore my email.

re reading my mail makes it look like I did not research it throughly and
just asked without actual implementation of both options and having a clear
goal on what can incompromise along with no read downtime.
I'll write better next time.

On Wed, Oct 26, 2022, 10:04 PM Vijaykumar Jain <
vijaykumarjain(dot)github(at)gmail(dot)com> wrote:

> Hi all,
>
> tl;dr
> I have a simple question,
> given a choice if I can write the same data to two databases in parallel,
> should I opt for primary / replica setup or multi writer/master setup. This
> setup has the ability to make use of kafka consumer groups (like two
> replication slots each having their own lsn offsets) to write to both db
> node pairs in parallel via the application layer.
>
> The churn of data is really high, there is a lot of wals generated, around
> 500gb/hr.
>
> If I go with primary/replica, (lr not feasible)
>
> I need to ensure both are on the same major version. Upgrades are tricky
> (we don't have qa) so we just have option to schema dump for upgrade
> compatibility. Data, we trust postgresql for that :). (I wish we had zfs
> everywhere but no )
>
> Any excl table blocking operations, (although with later versions there
> are very less blocking operations) can impact queries on replica as well
> (excluding delay settings).
>
> Corruption can cause downtime (we have tons of them and raids to protect
> them) so if replica is having issues, we can zero the pages on the replica
> and do some operations if we isolate the problem pages, else resync the
> replica from primary. But if primary is having some issues, we copy data
> from replica to disk and copy in to primary after truncating etc. Some
> downtime but not a lot. (I am not expert at data recovery) and mostly rely
> on amcheck, dd, and raid checks.
>
> We don't use pitr (too many wals × 58) or delayed replication as we can't
> afford more servers.
>
> ddl deploys are guaranteed by replication. So no need to try 2pc like
> stuff at app layer. (Although apps use deploy tools to ensure eventually
> the ddls are consistent and idempotent)
>
> Basically primary/replica relieves the app to think what is there on the
> primary is also on the replica eventually, so there can be source of truth.
>
> But with multi writers, any app mishandling like bug in catching exception
> etc can result in diversion and no more mirrored setup.
> We need to have checks/reconciliation to ensure both write nodes in pair
> have almost similar data at the end of the day so we can trust this setup
> independent of any app mistakes.
>
> But if app layer gets robust, we have almost no downtime in reads and
> writes, we can have both nodes on different versions, (w/o logical
> replication) can query both nodes real time, no real replication lag issues
> , conflicts etc, can upgrade like blue green, canary test some changes on
> one if needed etc.
>
> Am I making sense at all? Or I am sounding confused, and I don't know the
> difference between primary/replica vs multi writer. This is not bdr like
> thing, they don't really need each other unless we are into some recovery.
>
> My point is, we have 58 such primary/replica shards (each 10tb+)
> (consistent hashing at app layer, no fdw) and there is no scope of
> downtime for reads, so any issue like post upgrade performance degradation
> (if any) gives me chills. and we have no qa to test real data.
>
> There are too many dimensions to shard on and aggregations need to run
> across the shards (Yes there is no scope of data isolation).
>
>

In response to

Multi master disjoint cluster at 2022-10-26 16:34:16 from Vijaykumar Jain

Browse pgsql-general by date

	From	Date	Subject
Next Message	Zheng Li	2022-10-26 20:39:24	Re: Support logical replication of DDLs
Previous Message	Vijaykumar Jain	2022-10-26 16:34:16	Multi master disjoint cluster