Re: autovacuum failing on pg_largeobject and disk usage of the pg_largeobject growing unchecked

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Jim Hurne <jhurne(at)us(dot)ibm(dot)com>
Cc: PostgreSQL General <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: autovacuum failing on pg_largeobject and disk usage of the pg_largeobject growing unchecked
Date: 2020-06-22 20:44:33
Message-ID: CABUevEwbDJBR0ZpGxfde9bg6RPNcGGPQFQjudjpxtVBUdC-GYg@mail.gmail.com
Lists: pgsql-general

On Mon, Jun 22, 2020 at 10:01 PM Jim Hurne <jhurne(at)us(dot)ibm(dot)com> wrote:

> We're still struggling to figure out why autovacuum seems to be failing or
> is ineffective sometimes against the pg_largeobject table.
>
> We decided to try a VACUUM FULL VERBOSE on one of our problematic
> databases. It did complete, but it took a surprisingly long time:
>
> INFO: vacuuming "pg_catalog.pg_largeobject"
> INFO: "pg_largeobject": found 97326130 removable, 22230 nonremovable
> row versions in 207508868 pages
> DETAIL: 0 dead row versions cannot be removed yet.
> CPU: user: 1085.87 s, system: 3803.66 s, elapsed: 10883.94 s.
> VACUUM
>
> So it took about 3 hours. It's surprising because there was only 58 MB of
> undeleted data in about 65 large objects. When we ran this particular
> vacuum, we made sure there were no other processes connected to the DB and
> that no other statements were running that would have locked the table.
> It's my understanding that a VACUUM FULL works by creating a brand new
> table and copying the "non-deleted" data into the new table. If that's the
> case, I would have expected it to complete very quickly, given how little
> "non-deleted" data existed at the time.
>

Yes, that's how VACUUM FULL works (more or less).
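
As a quick illustration (a sketch against a test database, not something you
need to run on production): the rewrite shows up as a new relfilenode once
VACUUM FULL swaps in the freshly written table:

  SELECT pg_relation_filenode('pg_catalog.pg_largeobject');  -- old file
  VACUUM FULL pg_largeobject;
  SELECT pg_relation_filenode('pg_catalog.pg_largeobject');  -- different after the rewrite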

It had to read 207508868 pages, which is about 1.5TB worth of data
(207508868 pages of 8 kB each), in order to determine that almost none of
it was still needed. That's probably what took the time. Depending on the
performance of the machine, that does not seem unreasonable. (The problem,
of course, being that the table grew to this size while holding so little
actually useful data.)
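
For reference, the back-of-the-envelope in SQL (assuming the default 8 kB
block size -- check with SHOW block_size -- and using pg_table_size for the
current on-disk footprint):

  SELECT pg_size_pretty(207508868::bigint * 8192);   -- what the rewrite read, ≈ 1.5 TB
  SELECT pg_size_pretty(pg_table_size('pg_catalog.pg_largeobject'));  -- size on disk now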

As for autovacuum: with a cost_delay of 20ms and a cost_limit of 200,
autovacuum would spend roughly 58 hours just on vacuum delay for the reads,
if my math isn't off (assuming every read is a cache miss and thus costs
10 "units", which is probably close enough to give you an idea). Reading
207508868 pages would then hit the cost limit every 20 pages, i.e.
207508868/20 times, sleeping 20ms each of those times.
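
A rough version of that arithmetic as a query (assuming the default
vacuum_cost_page_miss of 10, so the limit of 200 is hit every 20 pages):

  -- number of sleeps * 20ms, expressed in hours; purely illustrative
  SELECT (207508868 / 20) * 0.020 / 3600 AS hours_asleep;   -- ≈ 57.6 hours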

> We are of course going to continue to try different things, but does
> anyone have any other suggestions on what we should be looking at or what
> settings we might want to adjust?
>

The logs you posted originally seem to be excluding the actual autovacuum
details -- can you include those? That is, you are only including the very
last line of the log message, but the interesting parts are at the beginning.

I assume you've also looked for other autovacuum messages in the log --
such as it being canceled by concurrent activity?
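
If it helps while digging through the logs, the stats views will also tell
you whether autovacuum has been reaching the table at all (a supplementary
check, not a replacement for the log details):

  SELECT relname, last_autovacuum, autovacuum_count, n_dead_tup
  FROM pg_stat_all_tables
  WHERE relname = 'pg_largeobject';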

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
