Re: replication primary writting infinite number of WAL files

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Les <nagylzs(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: replication primary writting infinite number of WAL files
Date: 2023-11-24 17:51:55
Message-ID: 430d83f0-65ff-47b0-9600-26094099fc17@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 11/24/23 09:32, Les wrote:

Please Reply All to include list
Ccing list to get information back there.

>
>
> Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com
> <mailto:adrian(dot)klaver(at)aklaver(dot)com>> (2023. nov. 24., P, 17:50):
>
> On 11/24/23 03:39, Les wrote:
> > The only exception is a sequence
> > value that was moved millions of steps within a single minute. Of
>
> Did you determine this by looking at select * from some_seq?
>
>
> select dd, (select max(id) from some_frequently_changed_table where
> created < dd) as id
> FROM generate_series
>         ( '2023-11-24 10:50'::timestamp
>         , '2023-11-22 10:30'::timestamp
>         , '-1 minute'::interval) dd
>         ;
>
> Here is a fragment from the first occasion:
>
> 2023-11-24 10:31:00.000|182920700600|
> 2023-11-24 10:30:00.000|182920700500|
> 2023-11-24 10:29:00.000|182920699900|
> 2023-11-24 10:28:00.000|182920699900|
> 2023-11-24 10:27:00.000|182920699900|
> 2023-11-24 10:26:00.000|182920663400|
> 2023-11-24 10:25:00.000|182920663400|
> 2023-11-24 10:24:00.000|176038405400|
> 2023-11-24 10:23:00.000|176038405400|
> 2023-11-24 10:22:00.000|176038405400|
> 2023-11-24 10:21:00.000|176038405400|
> 2023-11-24 10:20:00.000|169819538300|
> 2023-11-24 10:19:00.000|169819538300|
> 2023-11-24 10:18:00.000|169819538300|
> 2023-11-24 10:17:00.000|167912236800|
> 2023-11-24 10:16:00.000|164226477100|
> 2023-11-24 10:15:00.000|164226477100|
> 2023-11-24 10:14:00.000|153516704200|
> 2023-11-24 10:13:00.000|153516704200|
> 2023-11-24 10:12:00.000|153516704200|
> 2023-11-24 10:11:00.000|153516704200|
> 2023-11-24 10:10:00.000|153516704200|
> 2023-11-24 10:09:00.000|144613764500|
> 2023-11-24 10:08:00.000|144613764500|
> 2023-11-24 10:07:00.000|144613764500|
> 2023-11-24 10:06:00.000|144613764500|
> 2023-11-24 10:05:00.000|144312488400|
>
> Sequence is incremented by 100,  so for example, between 2023-11-24
> 10:20:00 and 2023-11-24 10:21:00 it went up 62188671 steps. I think it
> is not possible to insert 62188671 rows into a table. A psql function
> might be able to increment a sequence 62M times / minute, I'm not sure.
>
> On the second occasion, there these were the biggest:
>
> dd                     |id          |
> -----------------------+------------+
> 2023-11-24 10:50:00.000|182921196400|
> 2023-11-24 10:49:00.000|182921196400|
> 2023-11-24 10:48:00.000|182921196400|
> 2023-11-24 10:47:00.000|182921196400|
> 2023-11-24 10:46:00.000|182921192500|
> 2023-11-24 10:45:00.000|182921192500|
> 2023-11-24 10:44:00.000|182921192500|
> 2023-11-24 10:43:00.000|182921191900|
> 2023-11-24 10:42:00.000|182921191300|
> 2023-11-24 10:41:00.000|182921189900|
> 2023-11-24 10:40:00.000|182921189900|
> 2023-11-24 10:39:00.000|182921188100|
> 2023-11-24 10:38:00.000|182921188100|
> 2023-11-24 10:37:00.000|182921188100|
> 2023-11-24 10:36:00.000|182921188100|
> 2023-11-24 10:35:00.000|182920838600|
> 2023-11-24 10:34:00.000|182920838600|
> 2023-11-24 10:33:00.000|182920838600|
> 2023-11-24 10:32:00.000|182920838600|
> 2023-11-24 10:31:00.000|182920700600|
> 2023-11-24 10:30:00.000|182920700500|
>
>
>
> Are the servers open to the world and if so have you explored whether
> there has been an intrusion?
>
> They are not open to the world. We did not see any sign of intrusion,
> but of course this is possible.
>
> We are using dev databases that are created from snapshots of the
> standby. There is a possibility that a dev database instance (created
> from a snapshot of the standby) might have connected the primary just
> before it was reconfigured to be standalone. Can this be a problem?
>
>
> Do you have logs that cover the period from when it transitioned from
> working normally to going haywire?
>
> Yes. That log only contains messages saying that "checkpoints are
> happening too frequently", nothing else.
>
>
>
>
> You are using repmgr which as I understand it uses streaming not
> logical
> replication.
>
> Yes, we are using streaming replication.

--
Adrian Klaver
adrian(dot)klaver(at)aklaver(dot)com

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Les 2023-11-24 17:52:59 Re: replication primary writting infinite number of WAL files
Previous Message Laurenz Albe 2023-11-24 17:31:29 Re: replication primary writting infinite number of WAL files