Re: Parallel heap vacuum

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, John Naylor <johncnaylorls(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel heap vacuum
Date: 2025-03-11 12:51:07
Message-ID: CAA4eK1Ji9bX7wXXQsQKNvudipWUb6F7iwGa0L3Jhcd7ri8Gy-w@mail.gmail.com
Lists: pgsql-hackers

On Tue, Mar 11, 2025 at 5:00 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sun, Mar 9, 2025 at 11:28 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> >
> > Does phase 3 also use parallelism? If so, can we try to divide the
> > ring buffers among workers or at least try vacuum with an increased
> > number of ring buffers. This would be good to do for both the phases,
> > if they both use parallelism.
>
> No, only phase 1 was parallelized in this test. In parallel vacuum,
> since it uses (ring_buffer_size * parallel_degree) memory, more pages
> are loaded during phase 1, increasing cache hits during phase 3.
>

Shouldn't we ideally also test a vacuum without parallelism, using a
ring buffer of ring_buffer_size * parallel_degree, to make the
comparison fairer? Also, what could be the reason for the variation in
the phase 1 numbers? Do you restart the system after each run to
ensure nothing is left in memory? If not, shouldn't we try at least a
few runs with a restart before each one, so that no cached data
carries over?
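
To make the suggested baseline concrete, here is a rough sketch of how
the serial run could be sized. It assumes the ring buffer is set
through VACUUM's BUFFER_USAGE_LIMIT option (PostgreSQL 16 and later)
and that PARALLEL 0 keeps the run serial; the helper names, the table
name, and the sizes are hypothetical, not taken from the test in this
thread.

```python
def equivalent_buffer_limit_kb(ring_buffer_kb: int, parallel_degree: int) -> int:
    """Ring-buffer memory (in kB) that a parallel run would use in total:
    ring_buffer_size * parallel_degree."""
    if ring_buffer_kb <= 0 or parallel_degree <= 0:
        raise ValueError("ring buffer size and parallel degree must be positive")
    return ring_buffer_kb * parallel_degree


def serial_vacuum_sql(table: str, ring_buffer_kb: int, parallel_degree: int) -> str:
    """Build a non-parallel VACUUM statement whose buffer usage limit matches
    the total memory of the parallel run (assumption: BUFFER_USAGE_LIMIT is
    the knob that controls the ring buffer size)."""
    limit_kb = equivalent_buffer_limit_kb(ring_buffer_kb, parallel_degree)
    return f"VACUUM (PARALLEL 0, BUFFER_USAGE_LIMIT '{limit_kb}kB') {table};"


if __name__ == "__main__":
    # Hypothetical values: a 2048 kB ring buffer and 4 parallel workers.
    print(serial_vacuum_sql("test_table", 2048, 4))
```

Running the serial case with this larger limit would show how much of
the phase 3 difference comes from the extra cached pages rather than
from parallelism itself.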

--
With Regards,
Amit Kapila.
