Re: Questions on logical replication

From: Kashif Zeeshan <kashi(dot)zeeshan(at)gmail(dot)com>
To: Koen De Groote <kdg(dot)dev(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Questions on logical replication
Date: 2024-06-07 04:20:30
Message-ID: CAAPsdhd4y2z7F0uHsw3ifEYGexu=Pyj7DFQQZ5FQtdoghsgLWA@mail.gmail.com
Lists: pgsql-general

On Fri, Jun 7, 2024 at 3:19 AM Koen De Groote <kdg(dot)dev(at)gmail(dot)com> wrote:

> I'll give them a read, though it might take a few weekends
>
> Meanwhile, this seems to be what I'm looking for:
>
> From
> https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION-SLOTS
>
> " Replication slots provide an automated way to ensure that the primary
> does not remove WAL segments until they have been received by all standbys,
> and that the primary does not remove rows which could cause a recovery
> conflict
> <https://www.postgresql.org/docs/current/hot-standby.html#HOT-STANDBY-CONFLICT>
> even when the standby is disconnected."
>
> I'm reading that as: "if there is a replication slot and the standby is
> disconnected, WAL is kept"
>
> And if we know WAL is kept in the "pg_wal" directory, that sounds like it
> could slowly but surely fill up disk space.
>

Hi

Yes, that is a consideration with logical replication, but the benefit
usually outweighs the possible cost.
The amount of retained WAL only keeps growing while the standby is offline;
once it reconnects and catches up, the older segments can be removed again.
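
For example (a minimal sketch, assuming PostgreSQL 13 or newer, where
wal_status and max_slot_wal_keep_size are available), you can check how much
WAL each slot is holding back, and optionally cap it so a long-disconnected
standby cannot fill pg_wal:

  -- How much WAL is each replication slot currently retaining?
  SELECT slot_name,
         slot_type,
         active,
         wal_status,
         pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn))
             AS retained_wal
  FROM pg_replication_slots;

  -- Optional safety valve (the 50GB here is only an illustrative value):
  -- slots that fall further behind than this are invalidated (wal_status
  -- becomes 'lost'), so pg_wal cannot grow without bound because of them.
  ALTER SYSTEM SET max_slot_wal_keep_size = '50GB';
  SELECT pg_reload_conf();

Keep in mind that once a slot is invalidated the subscriber or standby has to
be re-initialized, so pick a limit that matches the disk space you can spare.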

Regards
Kashif Zeeshan
Bitnine Global

>
>
> But again, I'll give them a read. I've read all of logical replication
> already, and I feel like I didn't get my answer there.
>
> Thanks for the help
>
>
> Regards,
> Koen De Groote
>
> On Thu, Jun 6, 2024 at 12:19 AM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
> wrote:
>
>> On 6/5/24 14:54, Koen De Groote wrote:
>> > https://www.postgresql.org/docs/current/wal-configuration.html
>> > <https://www.postgresql.org/docs/current/wal-configuration.html>
>> >
>> > "Checkpoints are points in the sequence of transactions at which it
>> is
>> > guaranteed that the heap and index data files have been updated with
>> > all
>> > information written before that checkpoint. At checkpoint time, all
>> > dirty data pages are flushed to disk and a special checkpoint
>> record is
>> > written to the WAL file. (The change records were previously
>> flushed to
>> > the WAL files.) In the event of a crash, the crash recovery
>> procedure
>> > looks at the latest checkpoint record to determine the point in the
>> WAL
>> > (known as the redo record) from which it should start the REDO
>> > operation. Any changes made to data files before that point are
>> > guaranteed to be already on disk. Hence, after a checkpoint, WAL
>> > segments preceding the one containing the redo record are no longer
>> > needed and can be recycled or removed. (When WAL archiving is being
>> > done, the WAL segments must be archived before being recycled or
>> > removed.)"
>> >
>> >
>> > And this is the same for logical replication and physical replication,
>> I
>> > take it.
>>
>> High level explanation, both physical and logical replication use the
>> WAL files as the starting point. When the recycling is done is dependent
>> on various factors. My suggestion would be to read through the below to
>> get a better idea of what is going on. There is a lot to cover, but if you
>> really want to understand it you will need to go through it.
>>
>> Physical replication
>>
>> https://www.postgresql.org/docs/current/high-availability.html
>>
>> 27.2.5. Streaming Replication
>> 27.2.6. Replication Slots
>>
>> Logical replication
>>
>> https://www.postgresql.org/docs/current/logical-replication.html
>>
>> WAL
>>
>> https://www.postgresql.org/docs/current/wal.html
>>
>>
>>
>> >
>> > Thus, if a leader has a standby of the same version, and meanwhile
>> > logical replication is being done to a newer version, both those
>> > replications are taken into account, is that correct?
>>
>> Yes, see links above.
>>
>>
>> > And if it cannot sync them, due to connectivity loss for instance, the
>> > WAL records will not be removed, then?
>>
>> Depends on the type of replication being done. It is possible for
>> physical replication to have WAL records removed that are still needed
>> downstream.
>>
>> From
>>
>>
>> https://www.postgresql.org/docs/current/warm-standby.html#STREAMING-REPLICATION
>>
>> "If you use streaming replication without file-based continuous
>> archiving, the server might recycle old WAL segments before the standby
>> has received them. If this occurs, the standby will need to be
>> reinitialized from a new base backup. You can avoid this by setting
>> wal_keep_size to a value large enough to ensure that WAL segments are
>> not recycled too early, or by configuring a replication slot for the
>> standby. If you set up a WAL archive that's accessible from the standby,
>> these solutions are not required, since the standby can always use the
>> archive to catch up provided it retains enough segments."
>>
>> This is why it is a good idea to go through the links I posted above.
>>
>> >
>> > Regards,
>> > Koen De Groote
>> >
>>
>>
>> --
>> Adrian Klaver
>> adrian(dot)klaver(at)aklaver(dot)com
>>
>>
