Re: BUG #17103: WAL segments are not removed after exceeding max_slot_wal_keep_size

From: Marcin Krupowicz <marcin(at)071(dot)ovh>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org, horikyota(dot)ntt(at)gmail(dot)com
Subject: Re: BUG #17103: WAL segments are not removed after exceeding max_slot_wal_keep_size
Date: 2021-07-14 07:52:34
Message-ID: CAK1Niayp_04JSEpSa+4HJihacOmNkvYsAwp-y+QWZ19ynyu2TA@mail.gmail.com
Lists: pgsql-bugs

Hi,

> > We are using max_slot_wal_keep_size on PostgreSQL 13 to prevent the master
> > from being killed by a lagging replication. It seems that, in our case, WAL
> > storage wasn't freed up after exceeding this parameter, which resulted in a
> > replication failure. The WAL that, I believe, should have been freed up did
> > not seem to be needed by any other transaction at the time.
>
> Yeah, max_slot_wal_keep_size is the maximum WAL size up to which
> replication slots are guaranteed to be able to keep files. It is not
> a size beyond which replication slots are guaranteed not to keep WAL
> files. In addition to that, WAL removal happens only at the end of a
> checkpoint, so WAL files can grow up to max_slot_wal_keep_size plus
> the checkpoint distance, assuming an even load.

I understand, but the situation lasted for many hours, until my manual
reboot. checkpoint_timeout is 5 minutes, so there were many checkpoints
between the time the slot was lost (after exceeding
max_slot_wal_keep_size) and my manual reboot. During all that time the
load was fairly even and the number of WAL segments stored wasn't
changing much.
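
For anyone watching the same thing, this is the kind of query I mean (just a
sketch; it assumes PostgreSQL 13+, where pg_replication_slots exposes
wal_status and safe_wal_size):

    -- wal_status is one of: reserved, extended, unreserved, lost.
    -- safe_wal_size is how much more WAL can be written before the slot is lost.
    SELECT slot_name,
           active,
           wal_status,
           pg_size_pretty(safe_wal_size) AS safe_wal_size,
           restart_lsn
    FROM pg_replication_slots;

In my case the slot had already been lost, yet the number of WAL segments was
not going down across many checkpoints.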

> > -- Configuration --
> > master & one replica - streaming replication using a slot
> > ~700GB available for pg_wal
> > max_slot_wal_keep_size = 600GB
> > min_wal_size = 20GB
> > max_wal_size = 40GB
> > default checkpoint_timeout = 5 minutes (no problem with checkpoints)
> > archiving is on and is catching up well
>
> Assuming an even load (or WAL speed) and 0.5 for
> checkpoint_completion_target, 40GB of max_wal_size causes checkpoints
> every 27GB (1706 segments) [1] at most (in the case where the xlog
> checkpoint fires before the timeout checkpoint).
>
> Thus, with 600GB of max_slot_wal_keep_size, the total size of WAL
> files can reach 627GB, and that can be exceeded further under a sudden
> load spike.
>
> [1] checkpoint distance = max_wal_size / (1.0 + checkpoint_completion_target)

Fair point, I should change my settings slightly - but that's not the
issue here.
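
For completeness, here is the arithmetic with my settings, following the
formula quoted above (a sketch only; it assumes 16MB WAL segments and
checkpoint_completion_target = 0.5):

    -- checkpoint distance = max_wal_size / (1.0 + checkpoint_completion_target)
    SELECT 40 * 1024 / (1.0 + 0.5)        AS checkpoint_distance_mb,    -- ~27306 MB, i.e. ~27 GB
           40 * 1024 / (1.0 + 0.5) / 16   AS checkpoint_distance_segs,  -- ~1706 16MB segments
           600 + 40 / (1.0 + 0.5)         AS worst_case_wal_gb;         -- ~627 GB before removal kicks in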

>
> > -- What happened --
> > Under heavy load (large COPY/INSERT transactions, loading hundreds of GB of
> > data), the replication started falling behind. Available space on pg_wal was
> > being reduced at the same rate as the slot's
> > pg_replication_slots.safe_wal_size - as expected. At some point safe_wal_size
> > went negative and streaming stopped working. That wasn't a problem, because
> > the replica started recovering from the WAL archive. I expected that once the
> > slot was lost, WAL would be removed down to max_wal_size. This did not happen,
> > though. It seems that Postgres tried to keep something close to
> > max_slot_wal_keep_size (600GB) of WAL around, in case the replica started
> > catching up again. During that time there was no single transaction that would
> > have required this much WAL to be kept, and archiving wasn't behind either.
>
> Useless WAL files will be removed after a checkpoint runs.

That did not happen.

> > The amount of free space on pg_wal was more or less 70GB for most of the time,
> > but at some point, during heavy autovacuuming, it dipped to 0 :( This is
> > when PG crashed (and auto-recovered soon after). After getting back up,
> > there were 11GB left on pg_wal, no transactions running, and no loading. This
> > lasted for hours. During this time the replica finally caught up from the
> > archive and restored the replication with no delay. None of the WAL files were
> > removed. I manually ran CHECKPOINT, but it did not clear any WAL. I finally
> > restarted PostgreSQL, and during the restart pg_wal was finally
> > cleared.
> >
> > Again - why did PG not clear the WAL? The WAL, even more clearly, was not
> > needed by any process.
>
> Maybe a manual CHECKPOINT works for you; however, you should reconsider
> the settings, taking the above behavior into account, to prevent a crash
> due to WAL storage exhaustion.

Sorry, I'm confused. I did run a manual CHECKPOINT (even though many,
many automatic checkpoints had run before that) and WAL segments were
NOT cleared until I restarted PostgreSQL.
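
For the record, a check along these lines shows whether a checkpoint actually
freed anything (a sketch; pg_ls_waldir() exists since PostgreSQL 10 and needs
appropriate privileges):

    -- force a checkpoint, then see what is actually sitting in pg_wal
    CHECKPOINT;
    SELECT count(*)                  AS wal_files,
           pg_size_pretty(sum(size)) AS wal_size
    FROM pg_ls_waldir();

In my case neither the file count nor the total size went down until the
restart.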

Thanks,
-- Marcin
