From: | Tomas Vondra <tomas(at)vondra(dot)me> |
---|---|
To: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
Cc: | Kirill Reshke <reshkekirill(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Parallel CREATE INDEX for GIN indexes |
Date: | 2025-02-25 15:49:03 |
Message-ID: | 08b57e98-2fd8-4372-bd1f-b15a010f171b@vondra.me |
Lists: | pgsql-hackers |
One more patch version / rebase. I've been planning to get 0001
committed, but I realized there's one more loose end - progress reporting.
I could have committed the patch without it, I guess, but Matthias actually
mentioned this a couple of days ago, so I took a stab at it. The build goes
through these 5 stages (on top of "INITIALIZE"):
PROGRESS_GIN_PHASE_INDEXBUILD_TABLESCAN
PROGRESS_GIN_PHASE_PERFORMSORT_1
PROGRESS_GIN_PHASE_MERGE_1
PROGRESS_GIN_PHASE_PERFORMSORT_2
PROGRESS_GIN_PHASE_MERGE_2
The phases up to PROGRESS_GIN_PHASE_MERGE_1 happen in workers, i.e. it
ends with workers feeding the sorted/merged data into the shared
tuplesort. The last two phases are in the leader, which merges the data
and actually inserts it into the GIN index.
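
The split between worker and leader phases described above can be sketched as a small toy model. This is not PostgreSQL code: the phase names are taken from the message, while `GinBuildPhase`, `PHASE_ORDER` and `runs_in` are hypothetical names used only for illustration, and the ordering is assumed to be the order in which the phases are listed.

```python
from enum import Enum


class GinBuildPhase(Enum):
    # Phase names as listed in the message; the Python class itself is hypothetical.
    TABLESCAN = "PROGRESS_GIN_PHASE_INDEXBUILD_TABLESCAN"
    PERFORMSORT_1 = "PROGRESS_GIN_PHASE_PERFORMSORT_1"
    MERGE_1 = "PROGRESS_GIN_PHASE_MERGE_1"
    PERFORMSORT_2 = "PROGRESS_GIN_PHASE_PERFORMSORT_2"
    MERGE_2 = "PROGRESS_GIN_PHASE_MERGE_2"


# Enum members iterate in definition order, which is assumed to match build order.
PHASE_ORDER = list(GinBuildPhase)


def runs_in(phase: GinBuildPhase) -> str:
    """Return "worker" for phases up to MERGE_1 and "leader" for the last two."""
    last_worker_phase = PHASE_ORDER.index(GinBuildPhase.MERGE_1)
    if PHASE_ORDER.index(phase) <= last_worker_phase:
        return "worker"
    return "leader"


if __name__ == "__main__":
    for phase in PHASE_ORDER:
        print(f"{phase.value}: {runs_in(phase)}")
```

Running the script prints each phase name followed by the kind of process assumed to execute it.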
The "parallel" part has the blocks_done/blocks_total showing progress,
per the parallel scan. The "leader" phases use tuples_done/tuples_total,
where "tuple" is the GIN tuple produced by workers (each worker reports
the number of "tuples" it writes into the shared tuplesort, the leader
then tracks how many it processed).
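
A minimal sketch of that accounting, again as a toy model rather than the patch's implementation. The counter names come from the message. `BuildProgress` and the three helper functions are hypothetical, and it is an assumption here that tuples_total is simply the sum of what the workers report.

```python
from dataclasses import dataclass


@dataclass
class BuildProgress:
    # Counter names from the message; the container class is hypothetical.
    blocks_total: int = 0
    blocks_done: int = 0
    tuples_total: int = 0
    tuples_done: int = 0


def scan_blocks(progress: BuildProgress, nblocks: int) -> None:
    """Parallel part: advance block progress, capped at blocks_total."""
    progress.blocks_done = min(progress.blocks_total, progress.blocks_done + nblocks)


def worker_wrote(progress: BuildProgress, ntuples: int) -> None:
    """A worker reports how many GIN tuples it wrote into the shared tuplesort.

    Assumption: the total is the sum of all worker reports.
    """
    progress.tuples_total += ntuples


def leader_processed(progress: BuildProgress, ntuples: int) -> None:
    """Leader phases: count the GIN tuples the leader has processed so far."""
    progress.tuples_done += ntuples


if __name__ == "__main__":
    progress = BuildProgress(blocks_total=100)
    scan_blocks(progress, 60)
    scan_blocks(progress, 40)
    worker_wrote(progress, 1500)   # first worker
    worker_wrote(progress, 2500)   # second worker
    leader_processed(progress, 1000)
    print(progress)
```

In this model the block counters only move during the scan and the tuple counters only move once the workers have handed their data to the leader, which mirrors the two kinds of progress described above.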
I think this works pretty nicely. I'm not entirely sure we need all the
phases; maybe it'd be fine to have the sort+merge as a single phase? Or
maybe there should be one extra "sort" phase? Workers do two sorts,
first on their "private" tuplesort, then on the "shared" one.
What annoys me a little bit is that we only see those stages if the
leader participates as a worker. With parallel_leader_participation=off
none of this is visible at all (though we still see the blocks from the scan).
regards
--
Tomas Vondra
Attachment | Content-Type | Size |
---|---|---|
v20250225-0001-Allow-parallel-CREATE-INDEX-for-GIN-indexe.patch | text/x-patch | 68.8 KB |
v20250225-0002-cleanup.patch | text/x-patch | 5.5 KB |
v20250225-0003-progress.patch | text/x-patch | 8.3 KB |
v20250225-0004-Compress-TID-lists-when-writing-GIN-tuples.patch | text/x-patch | 8.2 KB |
v20250225-0005-Enforce-memory-limit-during-parallel-GIN-b.patch | text/x-patch | 12.3 KB |
v20250225-0006-Use-a-single-GIN-tuplesort.patch | text/x-patch | 32.2 KB |
v20250225-0007-WIP-parallel-inserts-into-GIN-index.patch | text/x-patch | 18.4 KB |