Re: Built-in Raft replication

From: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Konstantin Osipov <kostja(dot)osipov(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Nikolay Samokhvalov <nik(at)postgres(dot)ai>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Built-in Raft replication
Date: 2025-04-16 04:07:28
Message-ID: 20FB597F-641F-48F8-8428-D8DDBA802D58@yandex-team.ru
Lists: pgsql-hackers

> On 16 Apr 2025, at 04:19, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> feebly, and seems to have a bus factor of 1. Another example is the
> Spencer regex engine; we thought we could depend on Tcl to be the
> upstream for that, but for a decade or more they've acted as though
> *we* are the upstream.

I think that's what Konstantin is proposing: to have our own Raft implementation, without dependencies.

IMO, to better understand what is proposed we need a more detailed description of the proposed system. How will the new system be configured? initdb, and then what? How does a new node join the cluster? What runs pg_rewind when necessary?

Some time ago Peter E proposed making it possible to start replication on top of an empty directory, so that the initial sync would be more straightforward. Heikki also proposed removing the archive race condition when choosing a new timeline. I think these steps are gradual movement in the same direction.

My view is that what Konstantin wants is automatic replication topology management. For some reason this technology is called HA, DCS, Raft, Paxos and many other scary words. But basically it manages primary_conninfo on some nodes to provide some fault-tolerance properties. I'd start the design from there, not from the Raft paper.
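
To make that framing concrete, here is a toy sketch in Python of topology management seen as a function from cluster state to per-node primary_conninfo values. Everything in it is hypothetical and for illustration only: the Node type, the plan_topology name, and the naive "lowest name wins" failover rule are assumptions, not part of any proposal in this thread. A real system would need agreement among nodes on who the primary is and would have to compare WAL positions before promoting anyone.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Node:
    # Hypothetical description of a cluster member.
    name: str
    host: str
    port: int = 5432
    healthy: bool = True


def plan_topology(nodes: List[Node],
                  current_primary: Optional[str]) -> Dict[str, Optional[str]]:
    """Map each healthy node name to the primary_conninfo it should use.

    The primary itself maps to None (it does not replicate from anyone).
    """
    healthy = [n for n in nodes if n.healthy]
    if not healthy:
        raise RuntimeError("no healthy nodes left")

    # Keep the current primary if it is still healthy.
    primary = next((n for n in healthy if n.name == current_primary), None)
    if primary is None:
        # Assumption: a deliberately naive election rule. This is the place
        # where Raft, a DCS, or any other agreement mechanism would plug in.
        primary = min(healthy, key=lambda n: n.name)

    conninfo = f"host={primary.host} port={primary.port}"
    return {n.name: (None if n.name == primary.name else conninfo)
            for n in healthy}
```

With nodes a, b and c and a as primary, b and c get a conninfo string pointing at a. If a is then marked unhealthy, calling the function again picks b and repoints c. The interesting design questions are all about how that second call is made safely, which is why I would start from this interface rather than from the consensus algorithm.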

Best regards, Andrey Borodin.
