Re: PostgreSQL logical replication depends on WAL segments?

From: Josef Machytka <josef(dot)machytka(at)gmail(dot)com>
To: Jeremy Finzel <finzelj(at)gmail(dot)com>
Cc: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, Postgres General <pgsql-general(at)postgresql(dot)org>
Subject: Re: PostgreSQL logical replication depends on WAL segments?
Date: 2019-01-23 15:05:32
Message-ID: CAGvVEFvvSQb9BYHxHGnGAK3C_kAjORO-3fR6W7jKcy0DrbF7og@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

Thanks, I see... So if I understand it correctly - since I have quite big
>> partitions like ~30 GB each in one parent table and from ~1GB to ~5 GB in
>> several others I presume I had to set wal_keep_segments to some really high
>> number and stop our security cronjob cleaning old WAL segments (because we
>> already had some problems with almost full disk due to old WAL segments)
>> until the whole transfer of snapshot is done. Because only after the whole
>> snapshot is transferred logical replication workers start to transfer WAL
>> logs reflecting changes done from the moment snapshot was taken...
>>
>> jm
>>
>
> Understand there are other downsides to just keeping around a huge amount
> of WAL segments apart from only taking up disk space. None of the data
> held in those WAL segments can be vacuumed away while they are left around,
> which can lead to significant bloat and performance issues over time.
>
> I'm not exactly clear on your use case, but if you need to just
> resychronize data for a single table, there is a built-in way to do that
> (actually would be nice if the docs spelled this out).
>
> On publisher:
>
> ALTER PUBLICATION mypub DROP TABLE old_data_table;
>
> On subscriber:
>
> ALTER SUBSCRIPTION mysub REFRESH PUBLICATION WITH ( COPY_DATA = true);
>
> On publisher:
>
> ALTER PUBLICATION mypub ADD TABLE old_data_table;
>
> On subscriber:
>
> ALTER SUBSCRIPTION mysub REFRESH PUBLICATION WITH ( COPY_DATA = true);
>
>
Many thanks Jeremy, I changed my implementation based on your
recommendation and now it works like a charm for all available tables.

> The last command will resync the table from the current table data,
> regardless of the WAL file situation. This is the "normal" way you would
> go about resynchronizing data between clusters when a long time has passed,
> rather than trying to keep all that WAL around!
>
> So far as I can tell from testing, above pattern is the easiest way to do
> this, and it will not resynchronize any of the other tables in your
> subscription.
>
> P.S. do heed the advice of the others and get more familiar with the docs
> around WAL archiving.
>
> Thanks,
> Jeremy
>

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Alban Hertroys 2019-01-23 16:22:23 Betr: Re: Reclaiming space for dropped database
Previous Message Tom Lane 2019-01-23 15:02:01 Re: Reclaiming space for dropped database