From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Melanie Plageman <melanieplageman(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, John Naylor <johncnaylorls(at)gmail(dot)com>
Subject: Re: Parallel heap vacuum
Date: 2025-02-20 01:31:32
Message-ID: CAD21AoAr+Jck6MAeYhTN50youuL=+z6Lt8a52Hrn_GuNRTDMPg@mail.gmail.com
Lists: pgsql-hackers
On Tue, Feb 18, 2025 at 4:43 PM Melanie Plageman
<melanieplageman(at)gmail(dot)com> wrote:
>
> On Mon, Feb 17, 2025 at 1:11 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Fri, Feb 14, 2025 at 2:21 PM Melanie Plageman
> > <melanieplageman(at)gmail(dot)com> wrote:
> > >
> > > Since the failure rate is defined as a percent, couldn't we just have
> > > parallel workers set eager_scan_remaining_fails when they get their
> > > chunk assignment (as a percentage of their chunk size)? (I haven't
> > > looked at the code, so maybe this doesn't make sense).
> >
> > IIUC, since the chunk size eventually becomes 1, we cannot simply
> > have parallel workers set the failure rate based on their assigned chunk.
>
> Yep. The ranges are too big (1-8192). The behavior would be too
> different from serial.
>
> > > Also, if you start with only doing parallelism for the third phase of
> > > heap vacuuming (second pass over the heap), this wouldn't be a problem
> > > because eager scanning only impacts the first phase.
> >
> > Right. I'm inclined to support only the second heap pass as the first
> > step. If we support parallelism only for the second pass, it cannot
> > help speed up freezing the entire table in emergency situations, but
> > it would be beneficial for cases where a big table has a large amount
> > of spread garbage.
> >
> > At least, I'm going to reorganize the patch set to support parallelism
> > for the second pass first and then the first heap pass.
>
> Makes sense.
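To make the chunk-size rounding problem above concrete, here is a
tiny standalone example (the 2% cap and the chunk sizes are made-up
numbers, not values taken from the patch). Once the adaptive chunk
size shrinks toward a single block, a per-chunk failure budget
truncates to zero, which is why scaling the cap per chunk would
diverge from the serial behavior of tracking it across a whole
region:

    #include <stdio.h>

    int
    main(void)
    {
        double  failure_rate = 0.02;    /* hypothetical 2% failure cap */
        int     chunk_sizes[] = {8192, 1024, 64, 8, 1};
        int     i;

        /* A per-chunk budget collapses to zero for small chunks. */
        for (i = 0; i < 5; i++)
            printf("chunk of %4d blocks -> budget of %d failures\n",
                   chunk_sizes[i], (int) (chunk_sizes[i] * failure_rate));
        return 0;
    }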
I've attached the updated patches. In this version, I focused on
parallelizing only the second pass over the heap. It's more
straightforward than supporting the first pass, though it still
requires many preliminary changes.
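As a rough illustration of the intended division of labor (a
standalone sketch with fabricated data, not code from the patches;
the real implementation iterates dead items via the shared TidStore
changes in v9-0003/0004): each worker repeatedly claims the next
block known to contain dead item pointers from a shared position and
marks those items unused.

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    #define NBLOCKS     16      /* blocks with dead items (made up) */
    #define NWORKERS    4

    typedef struct { int block; int ndead; } DeadItems;

    static DeadItems dead[NBLOCKS];
    static atomic_int next_idx;         /* shared iteration position */

    static void *
    worker(void *arg)
    {
        int     id = (int) (long) arg;
        int     i;

        /* Claim the next unprocessed block, the way a shared iterator
         * would hand out chunks of dead items to each worker. */
        while ((i = atomic_fetch_add(&next_idx, 1)) < NBLOCKS)
            printf("worker %d: block %d, %d dead items marked unused\n",
                   id, dead[i].block, dead[i].ndead);
        return NULL;
    }

    int
    main(void)
    {
        pthread_t   workers[NWORKERS];
        long        i;

        for (i = 0; i < NBLOCKS; i++)
            dead[i] = (DeadItems) { .block = (int) i * 10, .ndead = 3 };

        for (i = 0; i < NWORKERS; i++)
            pthread_create(&workers[i], NULL, worker, (void *) i);
        for (i = 0; i < NWORKERS; i++)
            pthread_join(workers[i], NULL);
        return 0;
    }

(Compile with -pthread; backend code would of course use PostgreSQL's
parallel worker infrastructure rather than raw threads.)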
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
Attachments:
  v9-0001-Introduces-table-AM-APIs-for-parallel-table-vacuu.patch (application/octet-stream, 5.1 KB)
  v9-0002-vacuumparallel.c-Support-parallel-table-vacuuming.patch (application/octet-stream, 21.4 KB)
  v9-0003-radixtree.h-support-shared-iteration.patch (application/octet-stream, 16.0 KB)
  v9-0004-tidstore.c-support-shared-iteration-on-TidStore.patch (application/octet-stream, 7.1 KB)
  v9-0005-Move-some-fields-of-LVRelState-to-LVVacCounters-s.patch (application/octet-stream, 7.9 KB)
  v9-0006-Support-parallelism-for-removing-dead-items-durin.patch (application/octet-stream, 18.7 KB)