Re: replication primary writing infinite number of WAL files

From: Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at>
To: Les <nagylzs(at)gmail(dot)com>
Cc: pgsql-general(at)lists(dot)postgresql(dot)org
Subject: Re: replication primary writing infinite number of WAL files
Date: 2023-11-24 17:31:29
Message-ID: 6097c54582a91b40eaab2d3d24152d98fcaf998f.camel@cybertec.at
Lists: pgsql-general

On Fri, 2023-11-24 at 16:59 +0100, Les wrote:
>
>
> Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> wrote (Fri, 24 Nov 2023, 16:00):
> > On Fri, 2023-11-24 at 12:39 +0100, Les wrote:
> > > Under normal circumstances, the number of write operations is relatively low, with an
> > > average of 4-5 MB/sec total write speed on the disk associated with the data directory.
> > > Yesterday, the primary server suddenly started writing to the pg_wal directory at a
> > > crazy pace, 1.5GB/sec, but sometimes it went up to over 3GB/sec.
> > > [...]
> > > Upon further analysis of the database, we found that we did not see any mass data
> > > changes in any of the tables. The only exception is a sequence value that was moved
> > > millions of steps within a single minute.
> >
> > That looks like some application went crazy and inserted millions of rows, but the
> > inserts were rolled back.  But it is hard to be certain with the clues given.
>
> Writing of WAL files continued after we shut down all clients, and restarted the primary PostgreSQL server.
>
> How can the primary server generate more and more WAL files (writes) after all clients have
> been shut down and the server was restarted? My only bet was the autovacuum. But I ruled
> that out, because removing a replication slot has no effect on the autovacuum (am I wrong?).

It must have been autovacuum. Removing a replication slot has an influence, since then
autovacuum can do more work. If the problem stopped when you dropped the replication slot,
it could be a coincidence.

> Now you are saying that this looks like a huge rollback.

It could have been many small rollbacks.

> Does rolling back changes require even more data to be written to the WAL after server
> restart?

No. My assumption would be that something generated lots of INSERTs that were all
rolled back. That creates WAL, even though you see no change in the table data.
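
One way to check this kind of claim would be to sample pg_current_wal_lsn()
before and after a suspect workload and compare the two positions. Below is a
minimal Python sketch of that arithmetic, assuming the usual textual LSN format
of two hexadecimal halves separated by a slash. The function names and the
sample LSN values are made up for illustration and do not come from the
affected server.

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to an absolute byte position."""
    high, low = lsn.split("/")
    # The part before the slash is the high 32 bits, the part after is the low 32 bits.
    return (int(high, 16) << 32) | int(low, 16)


def wal_bytes_between(start_lsn: str, end_lsn: str) -> int:
    """Bytes of WAL written between two LSN samples (same idea as pg_wal_lsn_diff)."""
    return lsn_to_bytes(end_lsn) - lsn_to_bytes(start_lsn)


# Hypothetical samples of pg_current_wal_lsn(), taken before and after a
# batch of INSERTs that was rolled back.
before = "0/3000000"
after = "0/5000000"
print(wal_bytes_between(before, after))  # 33554432 bytes, i.e. 32 MiB
```

If the difference is large while the table contents are unchanged, that would
be consistent with rolled-back work having produced the WAL.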

> Does removing a replication slot lessen the amount of data needed to be written for
> a rollback (or for anything else)?

No: the WAL is generated by whatever precedes the ROLLBACK, and the ROLLBACK does
not create a lot of WAL.

> It is a fact that the primary stopped writing at 1.5GB/sec the moment we removed the slot.

I have no explanation for that, except a coincidence.
Replication slots don't generate WAL.
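
For a sense of scale, the reported write rate can be turned into a number of
WAL segment files with a quick back-of-the-envelope calculation. This Python
sketch assumes the default 16 MB WAL segment size and reads "1.5GB/sec" in
binary units. Both are assumptions, since the thread states neither, and the
function name is hypothetical.

```python
WAL_SEGMENT_BYTES = 16 * 1024 * 1024  # default wal_segment_size (assumed unchanged)


def segments_per_second(write_rate_bytes_per_sec: float,
                        segment_bytes: int = WAL_SEGMENT_BYTES) -> float:
    """Number of WAL segment files filled per second at a given write rate."""
    return write_rate_bytes_per_sec / segment_bytes


# 1.5 GB/sec as reported in the thread, read here as 1.5 GiB/sec (assumption).
rate = 1.5 * 1024 ** 3
per_second = segments_per_second(rate)
print(per_second)       # 96.0 segments per second
print(per_second * 60)  # 5760.0 segments per minute
```

Under those assumptions, pg_wal would gain thousands of files per minute for
as long as the write rate is sustained and the files cannot be removed or
recycled.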

Yours,
Laurenz Albe
