Re: Built-in Raft replication

From: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Konstantin Osipov <kostja(dot)osipov(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Nikolay Samokhvalov <nik(at)postgres(dot)ai>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Built-in Raft replication
Date: 2025-04-16 04:33:15
Message-ID: CAExHW5udz081Unx5qcLZ89DnR4cp6OhW0TCoco3+AEkGw480uQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, Apr 16, 2025 at 9:37 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> My view is what Konstantin wants is automatic replication topology management. For some reason this technology is called HA, DCS, Raft, Paxos and many other scary words. But basically it manages primary_conn_info of some nodes to provide some fault-tolerance properties. I'd start to design from here, not from Raft paper.
>

In my experience, the load of managing hundreds of replicas that all
participate in the Raft protocol becomes larger than the regular
transaction load. So making every replica a Raft participant will
limit the ability to deploy hundreds of replicas. We may build an
extension that plays a role in the PostgreSQL world similar to that of
ZooKeeper in Hadoop. It can then be used for other distributed systems
as well, such as shared-nothing clusters based on FDW. There's already
a proposal to bring CREATE SERVER to the world of logical replication,
so I see these two worlds uniting in the future. The way I imagine it,
some PostgreSQL instances with this extension installed will act as a
Raft cluster (similar to a ZooKeeper ensemble or an etcd cluster). A
distributed system based on logical replication, FDW, or both will use
this ensemble to manage its shared state. The same ensemble can be
shared across multiple distributed clusters if it has scaling
capabilities.
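
As a rough illustration of why the voter count matters, here is a
minimal sketch under a deliberately simplified cost model: one
replication round is assumed to cost the leader one request to, and one
reply from, every other voter. The function names and the node counts
(5 and 300) are hypothetical, chosen only for the example; this is not
a measurement of any real deployment.

```python
# Simplified cost model (an assumption for illustration, not a measurement):
# in one Raft replication round the leader sends a request to every other
# voter and receives one reply from each.


def quorum_size(voters: int) -> int:
    """Smallest majority of a voter set of the given size."""
    return voters // 2 + 1


def leader_messages_per_round(voters: int) -> int:
    """Requests sent plus replies received by the leader in one round."""
    return 2 * (voters - 1)


if __name__ == "__main__":
    # Hypothetical sizes: a small dedicated ensemble vs. every replica voting.
    for label, voters in [("5-node ensemble", 5),
                          ("300 replicas as voters", 300)]:
        print(f"{label}: quorum={quorum_size(voters)}, "
              f"leader messages/round={leader_messages_per_round(voters)}")
```

With a small dedicated ensemble, the per-round cost stays fixed as
replicas are added, since the replicas are only clients of the ensemble
rather than voters in it.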

--
Best Wishes,
Ashutosh Bapat
