From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Kirill Reshke <reshkekirill(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel CREATE INDEX for GIN indexes
Date: 2025-02-16 03:47:10
Message-ID: d93e3428-45a1-46e5-8a89-41480257c169@vondra.me
Lists: pgsql-hackers
Hi,
Attached is a cleaned-up version of the patch series, squashed into
fewer patches as discussed. I went through all the comments and removed
or updated some obsolete ones, and I also updated the commit messages;
it'd be nice if someone could read through those to make sure they're
clear enough.
While cleaning up the comments, I realized there are still a couple of
XXX and FIXME comments with some valid open questions.
1) There are two places that explicitly zero memory, suggesting it's
because padding was causing issues in valgrind (in tuplesort). I need
to check if that's still true, but I wonder how the other tuplesort
variants manage to write stuff without tripping valgrind. Maybe the
GinTuple is just unusual in this regard.
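Just to illustrate the pattern I mean (a simplified sketch, not the
actual patch code):

    GinTuple   *tup = palloc(len);

    /*
     * Zero the whole buffer before filling in the fields, so any padding
     * bytes are defined when the tuple gets written out by the tuplesort
     * (i.e. effectively palloc0 instead of palloc).
     */
    memset(tup, 0, len);

The question is whether that memset is still needed, or whether valgrind
would complain about uninitialized padding without it.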
2) ginBuildCallbackParallel says this about memory limits:
* XXX It might seem this should set the memory limit to 32MB, same as
* what plan_create_index_workers() uses to calculate the number of
* parallel workers, but that's the limit for tuplesort. So it seems
* better to keep using work_mem here.
*
* XXX But maybe we should calculate this as a per-worker fraction of
* maintenance_work_mem. It's weird to use work_mem here, in a clearly
* maintenance command.
The function uses work_mem to limit the amount of memory used by each
worker, which seems a bit strange - it's a maintenance operation, so I
guess it would be more appropriate to use maintenance_work_mem.
I see the btree code also uses work_mem in some cases when building the
index, although there it's used to size the tuplesort. Here we have both
the tuplesorts (sized just like in the nbtree code) and the buffer used
to accumulate entries.
I wonder if maybe the right solution would be to use half the allowance
for tuplesort and half for the buffer. In the workers the allowance is
maintenance_work_mem / ginleader->nparticipanttuplesorts
while in the leader it's maintenance_work_mem. Opinions?
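To make that a bit more concrete, what I have in mind is roughly this
(just a sketch of the idea, not actual patch code, and sortmem/buffermem
are made-up names):

    int64       allowance;
    int64       sortmem;
    int64       buffermem;

    /* per-process allowance, as a fraction of maintenance_work_mem */
    if (IsParallelWorker())
        allowance = maintenance_work_mem / ginleader->nparticipanttuplesorts;
    else
        allowance = maintenance_work_mem;

    /* split the allowance between the tuplesort and the entry buffer */
    sortmem = allowance / 2;
    buffermem = allowance / 2;

with sortmem passed to the tuplesort and buffermem used as the limit for
the buffer accumulating entries.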
3) There's an XXX comment suggesting to use a separate memory context
for the GinBuffer, but I decided it doesn't seem really necessary. We're
not running any complex functions or anything like that in this code, so
I don't see a huge benefit in a separate context.
I know the patch reworking this to use a single tuplesort actually adds
the memory context, so maybe it's helpful for that patch. But for now I
don't see the point.
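(If the context does turn out to be useful for the single-tuplesort
patch, I'd expect it to be roughly just this - again a sketch, not the
actual patch code:

    buffer->context = AllocSetContextCreate(CurrentMemoryContext,
                                            "GIN build buffer",
                                            ALLOCSET_DEFAULT_SIZES);

with the buffer allocations done in that context and a
MemoryContextReset() whenever the buffer gets flushed.)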
4) The patch saving 12B in the GinTuple also added this comment:
* XXX: Update description with new architecture
but I'm a bit unsure what exactly is meant by "architecture", or what I
should add a description for.
regards
--
Tomas Vondra
Attachments:
  v20250216-0001-Allow-parallel-CREATE-INDEX-for-GIN-indexe.patch   (text/x-patch, 68.8 KB)
  v20250216-0002-Compress-TID-lists-when-writing-GIN-tuples.patch   (text/x-patch, 8.2 KB)
  v20250216-0003-Enforce-memory-limit-during-parallel-GIN-b.patch   (text/x-patch, 12.3 KB)
  v20250216-0004-Use-a-single-GIN-tuplesort.patch                   (text/x-patch, 32.1 KB)
  v20250216-0005-WIP-parallel-inserts-into-GIN-index.patch          (text/x-patch, 17.5 KB)