Re: Long running backup preventing auto vacuum

From: Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Pgsql-admin <pgsql-admin(at)lists(dot)postgresql(dot)org>
Subject: Re: Long running backup preventing auto vacuum
Date: 2022-07-01 08:31:56
Message-ID: CAFpL5Vw9z_S+rNUiYhozrBJMznzr-Jv5DDDSTNuFpc=f4mcaoA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

Hi Andrey,

One more thing to add is that, vacuum will not run for catalog tables
because of long running pg_stop_backup query

Query:

datname | pid | query_start | state | runtime
| query

----------------------+-------+---------+------------------+-------------+-----------------+-------------------------------+-----------------------------

postgres | 15088 | 2022-07-01 05:56:22.729101+00 | active |
02:12:56.255367 | SELECT labelfile, spcmapfile, lsn FROM
pg_stop_backup(false)

Example Vacuum:

<dbname>=# vacuum verbose analyze pg_database;

INFO: vacuuming "pg_catalog.pg_database"

INFO: "pg_database": found 0 removable, 5 nonremovable row versions in 1
out of 1 pages

DETAIL: 0 dead row versions cannot be removed yet, oldest xmin: 996487475

There were 0 unused item pointers.

Skipped 0 pages due to buffer pins, 0 frozen pages.

0 pages are entirely empty.

CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

INFO: analyzing "pg_catalog.pg_database"

INFO: "pg_database": scanned 1 of 1 pages, containing 5 live rows and 0
dead rows; 5 rows in sample, 5 estimated total rows

VACUUM

Autovacuum from logs of previous issue (timestamp does not match because
this is taken from last week logs):

022-06-26 00:00:21 UTC [32632]: [2-1] user=,db=,app=,client= HINT: Close
open transactions soon to avoid wraparound problems.

You might also need to commit or roll back old prepared transactions, or
drop stale replication slots.

2022-06-26 00:00:21 UTC [32632]: [3-1] user=,db=,app=,client= LOG: automatic
aggressive vacuum of table "postgres.pg_catalog.pg_statistic": index scans:
0

pages: 0 removed, 40 remain, 0 skipped due to pins, 0 skipped frozen

tuples: 0 removed, 1064 remain, 577 are dead but not yet removable, oldest
xmin: 4200201112

buffer usage: 57 hits, 43 misses, 0 dirtied

avg read rate: 203.846 MB/s, avg write rate: 0.000 MB/s

system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s

2022-06-26 00:00:21 UTC [32632]: [4-1] user=,db=,app=,client= WARNING: oldest
xmin is far in the past

2022-06-26 00:00:21 UTC [32632]: [5-1] user=,db=,app=,client= HINT: Close
open transactions soon to avoid wraparound problems.

Both xmin point to the long running pg_stop_backup. Not quite sure, but
would this cause autovacuum to not kick-in on application tables?

Thanks,

Nikhil

On Fri, Jul 1, 2022 at 1:11 PM Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com> wrote:

> Hi Andrey,
>
> ./wal-g7 --version
>
> wal-g version v1.1.2-rc 6af461f 2021.12.14_08:19:15 PostgreSQL
>
> The only probable problem I can imagine is that we open transaction during
>> pg_stop_backup() [0] to set a statement_timeout
>
> Do you think this will prevent the autovacuum from kicking in ?
>
> I think this tx can be safely removed
>
> How can I remove this? I can test whether this is the problem
>
> Thanks,
> Nikhil
>
> On Thu, Jun 30, 2022 at 5:18 PM Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
> wrote:
>
>> Hi!
>>
>> > On 30 Jun 2022, at 14:23, Nikhil Shetty <nikhil(dot)dba04(at)gmail(dot)com> wrote:
>> >
>> > Hi,
>> >
>> > PostgreSQL version - 11.7
>> >
>> > We are using wal-g for backup/restore. We had a duration of high WAL
>> generation due to application usage due to which archive backup was
>> delayed.
>> >
>> > Backup was started through cron schedule and it didn't complete because
>> it waits for the last WAL to be archived. Now, I know that long running
>> transactions will prevent vacuuming of dead tuples but in this case we were
>> able to run manual vacuums successfully but auto-vacuum never starts.
>> >
>> > As soon as we terminated the long running backup session, the
>> auto-vacuum started kicking in.
>> >
>> > Wanted to understand why a backup operation is blocking the auto-vacuum?
>> >
>> > Wal-g uses non-exclusive backups and waits for the last wal to be
>> archived.
>>
>>
>> As far as I know, WAL-G should not prevent running autovacuums on tables.
>> WAL-G can be run against replication standby, where vacuum is not possible
>> at all, so there must be no conflicts...
>>
>> The only probable problem I can imagine is that we open transaction
>> during pg_stop_backup() [0] to set a statement_timeout. I think this is
>> superflous and unneeded. I think this tx can be safely removed. But anyway,
>> this tx does not have a xid (otherwise it could not run on Standby).
>>
>> BTW which version of WAL-G do you use?
>>
>> Best regards, Andrey Borodin.
>>
>> [0]
>> https://github.com/wal-g/wal-g/blob/8b9b3ca06efb44be4724208dc6f6d11836bd34d9/internal/databases/postgres/queryRunner.go#L243
>
>

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message Andrey Borodin 2022-07-01 08:44:45 Re: Long running backup preventing auto vacuum
Previous Message Nikhil Shetty 2022-07-01 07:41:38 Re: Long running backup preventing auto vacuum