From: | Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com> |
---|---|
To: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Konstantin Osipov <kostja(dot)osipov(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Nikolay Samokhvalov <nik(at)postgres(dot)ai>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Built-in Raft replication |
Date: | 2025-04-16 04:33:15 |
Message-ID: | CAExHW5udz081Unx5qcLZ89DnR4cp6OhW0TCoco3+AEkGw480uQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Apr 16, 2025 at 9:37 AM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
> My view is what Konstantin wants is automatic replication topology management. For some reason this technology is called HA, DCS, Raft, Paxos and many other scary words. But basically it manages primary_conn_info of some nodes to provide some fault-tolerance properties. I'd start to design from here, not from Raft paper.
>
In my experience, the load of managing hundreds of replicas that all
participate in the Raft protocol becomes more than the regular
transaction load. So making every replica a Raft participant will
limit the ability to deploy hundreds of replicas.

We may instead build an extension which plays a role in the PostgreSQL
world similar to that of ZooKeeper in Hadoop. It can then be used for
other distributed systems as well, such as shared-nothing clusters
based on FDW. There's already a proposal to bring CREATE SERVER to the
world of logical replication, so I see these two worlds uniting in the
future. The way I imagine it, some PostgreSQL instances that have this
extension installed will act as a Raft cluster (similar to a ZooKeeper
ensemble or an etcd cluster). The distributed system based on logical
replication or FDW or both will use this ensemble to manage its shared
state. The same ensemble can be shared across multiple distributed
clusters if it has scaling capabilities.
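
To make the scaling concern a bit more concrete, here is a rough
back-of-the-envelope sketch, not a measurement. It assumes the textbook
Raft rule that an entry commits once a majority of voting members
acknowledge it, and it counts one AppendEntries request plus one reply
per follower. The function names and the ensemble size of 5 are
illustrative assumptions only.

```python
def quorum(voters: int) -> int:
    """Smallest majority among `voters` voting members."""
    return voters // 2 + 1


def messages_per_commit(voters: int) -> int:
    """Simplified count: the leader sends one AppendEntries to each
    follower and receives one reply from each (no batching, no retries)."""
    return 2 * (voters - 1)


def compare(replicas: int, ensemble_size: int = 5) -> dict:
    """Contrast 'every replica votes' with a small dedicated ensemble.

    `ensemble_size` is a hypothetical choice, similar in spirit to a
    ZooKeeper ensemble or an etcd cluster.
    """
    return {
        "all_replicas_vote": {
            "quorum": quorum(replicas),
            "messages": messages_per_commit(replicas),
        },
        "separate_ensemble": {
            "quorum": quorum(ensemble_size),
            "messages": messages_per_commit(ensemble_size),
        },
    }


if __name__ == "__main__":
    print(compare(300))
```

With every replica voting, both the quorum and the per-entry message
count grow with the number of replicas. With a separate ensemble they
stay fixed at the ensemble size, and the replicas only act as clients
of the shared state.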
--
Best Wishes,
Ashutosh Bapat