Re: Built-in Raft replication

From: Michael Banck <mbanck(at)gmx(dot)net>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Konstantin Osipov <kostja(dot)osipov(at)gmail(dot)com>, Greg Sabino Mullane <htamfids(at)gmail(dot)com>, Nikolay Samokhvalov <nik(at)postgres(dot)ai>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Built-in Raft replication
Date: 2025-04-16 07:50:45
Message-ID: 67ff6156.050a0220.18223a.416b@mx.google.com
Lists: pgsql-hackers

Hi,

On Wed, Apr 16, 2025 at 10:24:48AM +0500, Andrey Borodin wrote:
> I think I can provide some reasons why it can be neither an extension
> nor any part running within the postmaster's reign.
>
> 1. When joining a cluster, there’s no PGDATA to run postmaster on top
> of.
>
> 2. After failover, the old primary node must rejoin the cluster by
> running pg_rewind and following the timeline switch.
>
> The system at hand must be able to manipulate PGDATA without
> starting Postgres.

Yeah, while you could maybe implement some or all of the Raft protocol
in an extension, actually building something useful on top of it for
high availability or distributed whatever does not look feasible.
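
To make the two quoted cases a bit more concrete, here is a rough sketch of
the decision an external agent has to take before any postmaster exists. It
is only an illustration, not how Patroni or the proposal does it: the
function name plan_node_action and its parameters are made up for this
example, and it only builds the command line instead of running it. The
pg_basebackup and pg_rewind options used are the documented ones.

```python
import os
from typing import List


def plan_node_action(pgdata: str, primary_conninfo: str,
                     was_primary: bool) -> List[str]:
    """Return the command to run on PGDATA before Postgres can be started.

    Hypothetical helper: an empty list means the node can simply be started
    as a standby without touching the data directory first.
    """
    # An initialized data directory always contains a PG_VERSION file.
    if not os.path.exists(os.path.join(pgdata, "PG_VERSION")):
        # Case 1: joining the cluster, there is no PGDATA to start a
        # postmaster on, so the data directory has to be cloned first.
        return ["pg_basebackup", "-D", pgdata, "-d", primary_conninfo,
                "-X", "stream", "-R"]
    if was_primary:
        # Case 2: the old primary rejoins after failover and must be
        # rewound to the point where the timelines diverged. pg_rewind
        # needs the target server to be shut down.
        return ["pg_rewind", "--target-pgdata=" + pgdata,
                "--source-server=" + primary_conninfo]
    return []
```

In both non-trivial branches the work happens while Postgres is not running
on that node, which is the part that code living inside the postmaster
cannot cover.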

> My question to Konstantin is: why wouldn’t you just add Raft to
> Patroni?

Patroni can use pysyncobj, which is a Python implementation of Raft, so
then you do not need an external Raft provider like etcd, Consul or
ZooKeeper. However, the Patroni authors consider that option deprecated
because it is difficult to debug when it breaks.

I guess a better Python implementation of Raft for Patroni to use, or
Patroni implementing Raft itself, would help, but I believe nobody is
working on the latter right now or has any plans to do so. And nobody
seems to be working on a better pysyncobj either.

> Is there a reason why something like Patroni is not in core and no one
> rushes to get it in? Everyone is using it, or a system like it.

Well, Patroni is written in Python, for starters. It also does a lot
more than just leader election / cluster config. So I think nobody
seriously thought about proposing to put Patroni into core so far.

I guess the current proposal tries to be a step in the "something like
Patroni in core" direction, if you tilt your head a little. It's just
that the whole thing would be a really big step for Postgres, maybe
similar to deciding we want in-core replication way back when.

Michael
