Re: Built-in Raft replication

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Konstantin Osipov <kostja(dot)osipov(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Built-in Raft replication
Date: 2025-04-14 17:44:32
Message-ID: CALdSSPgzhXTsY4UF-S2AsJwVBMvTSn1sB+Lc9jOcnQ8x3ebktg@mail.gmail.com
Lists: pgsql-hackers

On Mon, 14 Apr 2025 at 22:15, Konstantin Osipov <kostja(dot)osipov(at)gmail(dot)com> wrote:
>
> Hi,

Hi

> I am considering starting work on implementing a built-in Raft
> replication for PostgreSQL.
>

Just some thoughts off the top of my head, if you want my opinion here:

I have a hard time believing the community will be positive about this
change in core. It has a better chance as a contrib extension. In fact, if
we want a built-in consensus algorithm, Paxos is a better option,
because you can use PostgreSQL as the local crash-safe storage for
single-decree Paxos: just store your state (ballot number, last vote) in a
heap table.
OTOH, Raft needs to write its own log and, what's worse, it sometimes
needs to remove already-written parts of it (so it is not append-only,
unlike WAL). A production system that maintains two kinds of logs with
different semantics is very hard to maintain.
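
To make the contrast concrete, here is a rough sketch in Python, under
loud assumptions: sqlite3 only stands in for a PostgreSQL heap table so
the snippet runs on its own, and the names paxos_state, PaxosAcceptor
and raft_append are made up for illustration. The Raft part also skips
the prevLogTerm consistency check and only shows the truncation step.

```python
import sqlite3


class PaxosAcceptor:
    """Single-decree Paxos acceptor; its durable state is one table row."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS paxos_state ("
            " id INTEGER PRIMARY KEY CHECK (id = 1),"
            " promised INTEGER NOT NULL,"
            " accepted_ballot INTEGER,"
            " accepted_value TEXT)"
        )
        conn.execute("INSERT OR IGNORE INTO paxos_state VALUES (1, -1, NULL, NULL)")
        conn.commit()

    def _load(self):
        return self.conn.execute(
            "SELECT promised, accepted_ballot, accepted_value"
            " FROM paxos_state WHERE id = 1"
        ).fetchone()

    def prepare(self, ballot):
        """Phase 1: promise only ballots higher than any promised so far."""
        promised, acc_ballot, acc_value = self._load()
        if ballot <= promised:
            return (False, None, None)
        self.conn.execute(
            "UPDATE paxos_state SET promised = ? WHERE id = 1", (ballot,)
        )
        self.conn.commit()  # the promise must be durable before replying
        return (True, acc_ballot, acc_value)

    def accept(self, ballot, value):
        """Phase 2: accept unless a higher ballot was already promised."""
        promised, _, _ = self._load()
        if ballot < promised:
            return False
        self.conn.execute(
            "UPDATE paxos_state SET promised = ?, accepted_ballot = ?,"
            " accepted_value = ? WHERE id = 1",
            (ballot, ballot, value),
        )
        self.conn.commit()
        return True


def raft_append(log, prev_index, entries):
    """Follower side of AppendEntries, simplified.

    log is a list of (term, command) with Raft index i stored at log[i - 1];
    entries are meant to follow Raft index prev_index.
    """
    for offset, entry in enumerate(entries):
        pos = prev_index + offset
        if pos < len(log) and log[pos][0] != entry[0]:
            del log[pos:]  # conflicting suffix is removed: not append-only
        if pos >= len(log):
            log.append(entry)
    return log
```

The acceptor only ever updates a single row in place, so ordinary
transactions and WAL already make it crash-safe. The follower, on the
other hand, has to delete a suffix of its own log when a new leader
disagrees with it, which is the second kind of log semantics mentioned
above.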

There is actually a production-ready (but not open source) implementation
of Raft as an extension, called BiHA, by Postgres Pro (pgpro).


--
Best regards,
Kirill Reshke
