From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Kirill Reshke <reshkekirill(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel CREATE INDEX for GIN indexes
Date: 2025-02-16 03:47:10
Message-ID: d93e3428-45a1-46e5-8a89-41480257c169@vondra.me
Lists: pgsql-hackers
Hi,
Attached is a cleaned-up version of the patch series, squashed into
fewer patches as discussed. I went through all the comments and removed
or updated some obsolete ones, and I also updated the commit messages;
it'd be nice if someone could read through those to make sure they're
clear enough.
While cleaning up the comments, I realized there are still a couple of
XXX and FIXME comments with some valid open questions.
1) There are two places that explicitly zero memory, suggesting it's
because padding was causing issues in valgrind (in tuplesort). I need
to check if that's still true, but I wonder how the other tuplesort
variants manage to write stuff without tripping valgrind. Maybe the
GinTuple is just unusual in this regard.
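Just to illustrate the pattern I mean (a simplified sketch, not the
actual patch code):

    GinTuple   *tup = palloc(len);

    /*
     * Zero the whole buffer before filling in the fields, so any padding
     * bytes are defined when the tuple gets written out by the tuplesort
     * (i.e. effectively palloc0 instead of palloc).
     */
    memset(tup, 0, len);

The question is whether that memset is still needed, or whether valgrind
would complain about uninitialized padding without it.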
2) ginBuildCallbackParallel says this about memory limits:
* XXX It might seem this should set the memory limit to 32MB, same as
* what plan_create_index_workers() uses to calculate the number of
* parallel workers, but that's the limit for tuplesort. So it seems
* better to keep using work_mem here.
*
* XXX But maybe we should calculate this as a per-worker fraction of
* maintenance_work_mem. It's weird to use work_mem here, in a clearly
* maintenance command.
The function uses work_mem to limit the amount of memory used by each
worker, which seems a bit strange - it's a maintenance operation, so I
guess it would be more appropriate to use maintenance_work_mem.
I see the btree code also uses work_mem in some cases when building the
index, although there it's used to size the tuplesort. Here we have both
the tuplesorts (sized just like in the nbtree code) and the buffer used
to accumulate entries.
I wonder if maybe the right solution would be to use half the allowance
for tuplesort and half for the buffer. In the workers the allowance is
maintenance_work_mem / ginleader->nparticipanttuplesorts
while in the leader it's maintenance_work_mem. Opinions?
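To make that a bit more concrete, what I have in mind is roughly this
(just a sketch of the idea, not actual patch code, and sortmem/buffermem
are made-up names):

    int64       allowance;
    int64       sortmem;
    int64       buffermem;

    /* per-process allowance, as a fraction of maintenance_work_mem */
    if (IsParallelWorker())
        allowance = maintenance_work_mem / ginleader->nparticipanttuplesorts;
    else
        allowance = maintenance_work_mem;

    /* split the allowance between the tuplesort and the entry buffer */
    sortmem = allowance / 2;
    buffermem = allowance / 2;

with sortmem passed to the tuplesort and buffermem used as the limit for
the buffer accumulating entries.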
3) There's an XXX comment suggesting to use a separate memory context
for the GinBuffer, but I decided it doesn't seem really necessary. We're
not running any complex functions or anything like that in this code, so
I don't see a huge benefit in a separate context.
I know the patch reworking this to use a single tuplesort actually adds
the memory context, so maybe it's helpful for that patch. But for now I
don't see the point.
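(If the context does turn out to be useful for the single-tuplesort
patch, I'd expect it to be roughly just this - again a sketch, not the
actual patch code:

    buffer->context = AllocSetContextCreate(CurrentMemoryContext,
                                            "GIN build buffer",
                                            ALLOCSET_DEFAULT_SIZES);

with the buffer allocations done in that context and a
MemoryContextReset() whenever the buffer gets flushed.)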
4) The patch saving 12B in the GinTuple also added this comment:
* XXX: Update description with new architecture
but I'm a bit unsure what exactly is meant by "architecture", or what I
should add a description for.
regards
--
Tomas Vondra
Attachments:
  v20250216-0001-Allow-parallel-CREATE-INDEX-for-GIN-indexe.patch   (text/x-patch, 68.8 KB)
  v20250216-0002-Compress-TID-lists-when-writing-GIN-tuples.patch   (text/x-patch, 8.2 KB)
  v20250216-0003-Enforce-memory-limit-during-parallel-GIN-b.patch   (text/x-patch, 12.3 KB)
  v20250216-0004-Use-a-single-GIN-tuplesort.patch                   (text/x-patch, 32.1 KB)
  v20250216-0005-WIP-parallel-inserts-into-GIN-index.patch          (text/x-patch, 17.5 KB)