Re: replication primary writting infinite number of WAL files

From: Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
To: Les <nagylzs(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: replication primary writting infinite number of WAL files
Date: 2023-11-24 21:37:56
Message-ID: 592a4e9d-0d12-40ac-ad2c-0ddc4caa760a@aklaver.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general


On 11/24/23 1:05 PM, Les wrote:
>
>
> >> Sequence is incremented by 100,  so for example, between
> 2023-11-24
> >> 10:20:00 and 2023-11-24 10:21:00 it went up 62188671 steps. I
> think it
> >> is not possible to insert 62188671 rows into a table. A psql
> function
> >> might be able to increment a sequence 62M times / minute, I'm
> not sure.
>
> Am I correct in assuming id has as it's default
> nextval(<the_sequence>)?
>
> All primary keys are "id int8 NOT NULL". They don't have a default,
> but all of these identifiers are generated from the same (global)
> sequence. We are using a single global sequence, because there are
> only a few writes. 99% of I/O operations are read.

A single sequence for all id columns across all tables?

How is the sequence value landing in the id column?

>
> If so it would seem to me something was doing a lot of INSERTS
> between
> 2023-11-24 10:20:00.000 and 2023-11-24 10:21:00.000.
>
> Yes. But is it possible to insert 62M rows within 1 minute?

Is has not been determined yet that 62 million inserts where done, just
that the sequence jumped by that number of steps.

>
> It might be possible to "select nextval" 1M times per second from an
> Pl/SQL function that does nothing else in an infinite loop. But that
> would not write too much data on the disk.

On an old laptop:

create table insert_test(id int);

insert into insert_test select val from generate_series(1, 1000000) as
t(val);
INSERT 0 1000000
Time: 943.918 ms

I would  say there is more then just the id being inserted, unless all
the other fields allow NULL.

>
> And there is nothing in the logs in that time period besides
> "checkpoints are happening too frequently"?
>
>
>
> The "set transaction" warning is due to a bug in an application that
> calls SET TRANSACTION before starting a transaction.

And what is that app doing when it does SET TRANSACTION?

>
>
> https://www.postgresql.org/docs/current/runtime-config-logging.html#RUNTIME-CONFIG-LOGGING-WHAT
>
> log_statement
>
> set to at least 'mod'?
>
> Unfortunately we don't, but now that you asked, we will turn this on.
>
>
>
> >> We are using dev databases that are created from snapshots of the
> >> standby. There is a possibility that a dev database instance
> (created
> >> from a snapshot of the standby) might have connected the
> primary just
> >> before it was reconfigured to be standalone. Can this be a problem?
>
> Was your original report for the dev databases also?
>
> No, every log and metric was for the prod db.

So how do the dev databases enter into this"

>
> How are the snapshots being taken?
>
> zfs snapshot is taken on the standby, then zfs clone is created on the
> snapshot, and a new postgresql instance is started on top of the
> clone. It recovers within one minute. In a very few cases (two or
> three times in a year), it fails to recover. Then we use a different
> snapshot. When the dev instance starts up, then we immediately delete
> everything from repmgr nodes, and disable repmgr completely. Today we
> noticed that the dev db was created in a network that made it possible
> to connect to the prod primary. (dev db's network was not separated
> from the prod db's network, fixed today). The dev db might have
> connected to the prod/primary after startup. But that dev instance was
> created 7 days ago, so probably it is not related.

What I know about ZFS would fit in the navel of flea, so someone else
will have to comment on this.

Dev connected to prod/primary how?

>
> Actually, we need this standby because we are using zfs snapshots for
> making frequent database backups. Without this standby, we can't make
> frequent backups. (The primary and the other standby are not using zfs.)
>
> Thanks,
>
>     Laszlo
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Ron Johnson 2023-11-24 21:39:57 Re: pg_getnameinfo_all() failed: Temporary failure in name resolution
Previous Message David G. Johnston 2023-11-24 21:26:14 Re: pg_getnameinfo_all() failed: Temporary failure in name resolution