Re: Advice on cluster architecture for two related, but distinct, use cases

From: Zahid Rahman <zahidr1000(at)gmail(dot)com>
To: Matthias Leisi <matthias(at)leisi(dot)net>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: Advice on cluster architecture for two related, but distinct, use cases
Date: 2024-11-07 23:49:16
Message-ID: CAPGSW3TdsjQZykFrB=hR4fnZCaSacdi_EOuWZndOBWfGWGNAUQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Perhaps a 14 minute investment in this article may prove fruitful.

https://medium.com/@martin.hodges/adding-a-postgres-high-availability-database-to-your-kubernetes-cluster-634ea5d6e4a1

On Thu, 7 Nov 2024, 21:06 Matthias Leisi, <matthias(at)leisi(dot)net> wrote:

> Dear all,
>
> (This is a follow-up to a question I asked almost exactly a year ago,
> https://postgrespro.com/list/thread-id/2670756#726F3765-858C-4AC0-A7B0-5CB6720E4B37(at)leisi(dot)net -
> the requirements have changed since then, and the platform has changed from
> OpenBSD to Linux, which may make some things easier.)
>
>
> I’m looking for advice on Postgres cluster architecture(s) for two related
> but distinct use cases. Ideally, the approaches for the two use cases would
> not differ too widely.
>
> The goal of clustering is low RPO (I guess we need sync clustering) and
> RTO (ideally almost-instant failover, but a failover process of up to a
> minute in the worst case could be acceptable); throughput is not a concern
> (it’s relatively low transaction volume except for some often-written
> statistics data, which is still moderate). Latency (due to the distance
> between datacenters for georedundancy) is a fact we are willing to accept.
>
>
> The first use case is in an environment under our own control (and where
> eg a DBA could intervene). We can theoretically run any number of cluster
> instances, but assume we would use an even number (split over the two
> datacenters), or potentially an odd number of nodes (eg with an arbiter).
> We could use a load balancer, but I guess this would strongly deviate from
> the second use case:
>
>
> In the second use case, the environment is not under our control, so we
> can only assume basic network connectivity from the application to the DB,
> and between the DBs (the latter potentially through an SSH tunnel if
> needed). In this use case, we can not assume a person to intervene if a
> node goes down, and would prefer some automated failover to the other node
> (this automation would also be welcome for the first use case, eg if
> something happens while nobody is watching). We can not assume eg a load
> balancer.
>
> There could be various ways how the environment in the second use case is
> set up, ranging from „application and database running on the same box“
> (well, no clustering for you then…), to dedicated two- or three node
> database cluster serving a number of application machines.
>
>
> In both use cases, we have full control over the application and the
> database code and environment.
>
> From reading various docs, it seems we would need something like Patroni
> (/Percona), at least for the first use case. However it seems relatively
> complex to set up and operate.
>
> I would appreciate your experience and input into which approach would
> best fit the two use cases. We are also willing to engage in paid
> consulting.
>
> Thanks,
> — Matthias
>
> --
> Matthias Leisi
> Katzenrütistrasse 68, 8153 Rümlang
> <https://www.google.com/maps/search/Katzenr%C3%BCtistrasse+68,+8153+R%C3%BCmlang?entry=gmail&source=g>
> Mobile +41 79 377 04 43
> matthias(at)leisi(dot)net
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message ravi k 2024-11-08 03:07:21 Re: Performance Issue with Hash Partition Query Execution in PostgreSQL 16
Previous Message Matthias Leisi 2024-11-07 21:06:29 Advice on cluster architecture for two related, but distinct, use cases