Re: Parallel heap vacuum

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Parallel heap vacuum
Date: 2024-11-11 18:15:17
Message-ID: CAD21AoDAEsju0F3M7Lh5bp1DShNdYm49dcQTgNcE_j4v-t6oeQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 11, 2024 at 5:08 AM Hayato Kuroda (Fujitsu)
<kuroda(dot)hayato(at)fujitsu(dot)com> wrote:
>
> Dear Sawda-san,
>
> >
> > I've attached new version patches that fixes failures reported by
> > cfbot. I hope these changes make cfbot happy.
>
> Thanks for updating the patch and sorry for delaying the reply. I confirmed cfbot
> for Linux/Windows said ok.
> I'm still learning the feature so I can post only one comment :-(.
>
> I wanted to know whether TidStoreBeginIterateShared() was needed. IIUC, pre-existing API,
> TidStoreBeginIterate(), has already accepted the shared TidStore. The only difference
> is whether elog(ERROR) exists, but I wonder if it benefits others. Is there another
> reason that lazy_vacuum_heap_rel() uses TidStoreBeginIterateShared()?

TidStoreBeginIterateShared() is designed for multiple parallel workers
to iterate a shared TidStore. During an iteration, parallel workers
share the iteration state and iterate the underlying radix tree while
taking appropriate locks. Therefore, it's available only for a shared
TidStore. This is required to implement the parallel heap vacuum,
where multiple parallel workers do the iteration on the shared
TidStore.

On the other hand, TidStoreBeginIterate() is designed for a single
process to iterate a TidStore. It accepts even a shared TidStore as
you mentioned, but during an iteration there is no inter-process
coordination such as locking. When it comes to parallel vacuum,
supporting TidStoreBeginIterate() on a shared TidStore is necessary to
cover the case where we use only parallel index vacuum but not
parallel heap scan/vacuum. In this case, we need to store dead tuple
TIDs on the shared TidStore during heap scan so parallel workers can
use it during index vacuum. But it's not necessary to use
TidStoreBeginIterateShared() because only one (leader) process does
heap vacuum.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2024-11-11 18:15:30 Re: [PoC] XMLCast (SQL/XML X025)
Previous Message Andrey M. Borodin 2024-11-11 18:03:36 Re: [PATCH] Add sortsupport for range types and btree_gist