Re: Built-in Raft replication

From: Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru>
To: Kirill Reshke <reshkekirill(at)gmail(dot)com>, Konstantin Osipov <kostja(dot)osipov(at)gmail(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Built-in Raft replication
Date: 2025-04-15 08:57:24
Message-ID: 142418c4-5db1-4221-a0a4-cd8aa9a17e83@postgrespro.ru
Lists: pgsql-hackers

14.04.2025 20:44, Kirill Reshke wrote:
> OTOH Raft needs to write its own log, and what's worse, it sometimes
> needs to remove already written parts of it (so, it is not appended
> only, unlike WAL). If you have a production system which maintains two
> kinds of logs with different semantics, it is a very hard system to
> maintain..

Raft is a log replication protocol that uses a log position and a term.
But... PostgreSQL already has a log position and a term in its WAL structure:
PostgreSQL's timeline is effectively the Term.
A Raft implementer just needs to correct the rules for Term/Timeline switching:
- instead of "the next TimeLine number is just an increment of the largest
known TimeLine number", it needs to be "the next TimeLine number is the
result of Leader Election".
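
To make the difference between the two rules concrete, here is a minimal
sketch in Python. It is a toy model, not PostgreSQL code: the Node class,
its fields, and both functions are hypothetical names invented for this
illustration. It assumes the usual Raft voting restriction (one vote per
term, and only for a candidate whose log is at least as up to date),
expressed in terms of (timeline, LSN).

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical cluster member; not a PostgreSQL structure."""
    name: str
    timeline: int      # timeline of the last WAL record (plays the Raft term)
    lsn: int           # last WAL position this node holds
    voted_in: int = 0  # highest timeline this node has already voted in

    def grant_vote(self, proposed, cand_timeline, cand_lsn):
        # At most one vote per proposed timeline, and never for an old one.
        if proposed <= self.voted_in or proposed <= self.timeline:
            return False
        # Refuse candidates whose WAL is behind ours.
        if (cand_timeline, cand_lsn) < (self.timeline, self.lsn):
            return False
        self.voted_in = proposed
        return True


def next_timeline_by_increment(known_timelines):
    # Rule as described above: largest known timeline plus one.
    return max(known_timelines) + 1


def next_timeline_by_election(candidate, peers):
    # Raft-style rule: the new timeline exists only if a majority agrees.
    proposed = max(candidate.timeline, candidate.voted_in) + 1
    candidate.voted_in = proposed  # the candidate votes for itself
    votes = 1 + sum(
        p.grant_vote(proposed, candidate.timeline, candidate.lsn)
        for p in peers
    )
    if votes * 2 > len(peers) + 1:
        return proposed
    return None  # election lost: no timeline switch happens
```

With the increment rule, two isolated replicas can both pick the same next
timeline number. With the election rule, a number is handed out only to a
node that a majority accepted, so at most one node can own it.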

And yes, "it sometimes needs to remove already written parts of it".
But... that is exactly what every PostgreSQL cluster manager has to do to
rejoin the previous leader as a follower of the new leader: pg_rewind.

So, PostgreSQL already has 70-90% of Raft's implementation details.
Raft doesn't have to be implemented in PostgreSQL.
Raft has to be finished!!!

PS: One of the biggest issues is the forced snapshot on replica promotion.
It really slows down the leader switch. It looks like it is not really
needed, or some small workaround should be enough.

--
regards
Yura Sokolov aka funny-falcon
