Re: Eagerly scan all-visible pages to amortize aggressive vacuum

From:	Melanie Plageman <melanieplageman(at)gmail(dot)com>
To:	Robert Haas <robertmhaas(at)gmail(dot)com>
Cc:	Robert Treat <rob(at)xzilla(dot)net>, Marcos Pegoraro <marcos(at)f10(dot)com(dot)br>, Alena Rybakina <a(dot)rybakina(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject:	Re: Eagerly scan all-visible pages to amortize aggressive vacuum
Date:	2025-01-29 16:34:55
Message-ID:	CAAKRu_aiDH6=HSaNCGmG9PFS4Vw-fMbVdF6XggM5Eqyz_=tLJQ@mail.gmail.com
Lists:	pgsql-hackers

On Fri, Jan 24, 2025 at 11:20 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Jan 24, 2025 at 9:15 AM Melanie Plageman
> <melanieplageman(at)gmail(dot)com> wrote:
> > So, in this case, there is only one table in question, so 1 autovacuum
> > worker (and up to 2 maintenance parallel workers for index vacuuming).
> > The duration I provided is just the absolute duration from start of
> > vacuum to finish -- not considering the amount of time each parallel
> > worker may have been working (also it includes time spent delaying).
> > The benchmark ran for 2.8 hours. I configured vacuum to run
> > frequently. In this case, master spent 47% of the total time vacuuming
> > and the patch spent 56%.
>
> Definitely not insignificant, but I think it's OK for a worst case.
> Autovacuum is a background process, so it's not like a 20% regression
> on query performance.
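
For reference, the "20%" above is roughly the relative increase implied by the two shares quoted earlier (47% on master, 56% with the patch). A minimal sketch of that arithmetic, assuming both runs lasted the same 2.8 hours; the function name is only illustrative and not part of any benchmark tooling:

```python
def vacuum_share_comparison(total_hours, master_share, patch_share):
    """Turn vacuum-time shares into hours and a relative increase.

    Assumption: master and the patch were measured over the same
    total benchmark duration.
    """
    master_hours = total_hours * master_share
    patch_hours = total_hours * patch_share
    # Relative increase of the patch over master, as a fraction.
    relative_increase = (patch_share - master_share) / master_share
    return master_hours, patch_hours, relative_increase


master_hours, patch_hours, rel = vacuum_share_comparison(2.8, 0.47, 0.56)
print(f"master: {master_hours:.2f} h, patch: {patch_hours:.2f} h, "
      f"relative increase: {rel:.1%}")
```

So the patch spends about 9 percentage points more of the run vacuuming, which is about 19% more vacuum time relative to master.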

So, I've done a few runs with FPIs (full-page images) turned off to
reduce run variance caused by vacuum and checkpoint timing.
Of course, this means that the amount of IO done by vacuum is very
different from a benchmark run with realistic settings.

I reran two of my simulations:

1)
- hot tail
32 clients inserting 20 rows then updating 1 row
duration: 3 hours

There is a small increase in total time spent vacuuming (< 10%), but
it is spread out. The first aggressive vacuum of the table takes 20
seconds with the patch and 9 minutes on master. And this is not an
append-only workload -- the tail of the table (up to 200,000 rows old)
is being updated (and potentially unfrozen). So, this feels like a
win.

The insert/update P99 latency is lower (better) with the patch around
the time of the aggressive vacuum.

2)
- hot tail with delete (worst-case)
32 clients inserting 20 rows then updating 1 row and 1
rate-limited client deleting all data before it can be aggressively
vacuumed
duration: 3 hours

There is a 10-15% increase in total time spent vacuuming with the
patch (30-40% of total benchmark runtime is spent vacuuming).

I ran the benchmark for 4 hours as well, and for that duration I
started to see a larger increase in vacuum IO time with the patch.
However, the 4 hour run had only one aggressive vacuum (around the 2.5
hour mark), so the numbers are hard to compare because the patch is
meant to do some of the work of the next aggressive vacuum in advance.

The insert/update P99 latency is the same or lower (better) with the patch.

Next I plan to run the hot tail with delete benchmark with default
settings (including FPIs) on master and with the patch for about 24
hours each. I'm hoping the long duration will smooth out some of the
run variance even with FPIs.

- Melanie

In response to

Re: Eagerly scan all-visible pages to amortize aggressive vacuum at 2025-01-24 16:20:30 from Robert Haas

Responses

Re: Eagerly scan all-visible pages to amortize aggressive vacuum at 2025-02-03 17:19:32 from Melanie Plageman
