Re: Parallel CREATE INDEX for GIN indexes

From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Kirill Reshke <reshkekirill(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel CREATE INDEX for GIN indexes
Date: 2025-01-06 19:13:43
Message-ID: CAEze2WjK=gReoEm2Kj-AvrH7x=+9AJX_Mb9bOR6g4n-c1JS_0w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 4 Jan 2025 at 17:58, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> On 11/24/24 19:04, Kirill Reshke wrote:
> > On Tue, 8 Oct 2024 at 17:06, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
> >>
> >> On 10/8/24 04:03, Michael Paquier wrote:
> >>>
> >>> _gin_parallel_build_main() is introduced in 0001. Please make sure to
> >>> pass down a query ID.
> >>
> >> Thanks for the ping. Here's an updated patch doing that, and also fixing
> >> a couple whitespace issues. No other changes, but I plan to get back to
> >> this patch soon - before the next CF.
> >>
> >>
> >> regards
> >>
> >> --
> >> Tomas Vondra
> >
> > Hi! I was looking through this series of patches because thread of
> > GIN&GIST amcheck patch references it.
> >
> > I have spotted this in gininsert.c:
> > 1)
> >> /*
> >> * Store shared tuplesort-private state, for which we reserved space.
> >> * Then, initialize opaque state using tuplesort routine.
> >> */
> >> sharedsort = (Sharedsort *) shm_toc_allocate(pcxt->toc, estsort);
> >> tuplesort_initialize_shared(sharedsort, scantuplesortstates,>
> > pcxt->seg);
> >> /*
> >> * Store shared tuplesort-private state, for which we reserved space.
> >> * Then, initialize opaque state using tuplesort routine.
> >> */
> >
> > Is it necessary to duplicate the entire comment?
> >
>
> Yes, that's a copy-paste mistake. Removed the second comment.
>
> > And, while we are here, isn't it " initialize the opaque state "?
> >
>
> Not sure, this is copy pasted as-is from nbtree code.
>
> > 2) typo :
> > * the TID array, and returning false if it's too large (more thant work_mem,
> >
>
> Fixed.
>
> > 3) in _gin_build_tuple:
> >
> > ....
> > else if (typlen == -2)
> > keylen = strlen(DatumGetPointer(key)) + 1;
> > else
> > elog(ERROR, "invalid typlen");
> >
> >
> > Maybe `elog(ERROR, "invalid typLen: %d", typLen); ` as in `datumGetSize`?
> >
>
> Makes sense, I reworded it a little bit. But it's however supposed to be
> a can't happen condition.
>
> > 4) in _gin_compare_tuples:
> >
> >> if ((a->category == GIN_CAT_NORM_KEY) &&
> > (b->category == GIN_CAT_NORM_KEY))
> >
> > maybe just a->category == GIN_CAT_NORM_KEY? a->category is already
> > equal to b->category because of previous if statements.
> >
>
> True. I've simplified the condition.
>
> > 5) In _gin_partition_sorted_data:
> >
> >> char fname[128];
> >> sprintf(fname, "worker-%d", i);
> >
> > Other places use MAXPGPATH in similar cases.
> >
>
> OK, fixed the two places that format worker-%d.
>
> > Also, code `sprintf(fname, "worker-%d",...);` duplicates. This might
> > be error-prone. Should we have a macro/inline function for this?
> >
>
> Maybe. I think using a constant might be a good idea, but anything more
> complicated is not worth it. There's only two places using it, not very
> far apart.
>
> > I will take another look later, maybe reporting real problems, not nit-picks.
> >
>
> Thanks. Attached is a rebased patch series fixing those issues, and one
> issue I found in an AssertCheckGinBuffer, which was calling the other
> assert (AssertCheckItemPointers) even for empty buffers. I think this
> part might need some more work, so that it's clear what the various
> asserts assume (or rather to allow just calling AssertCheckGinBuffer
> everywhere, with some flags).

Thanks for the rebase.

> 0001

In gininsert.c, I think I'd prefer GinBuildShared over GinShared.
While current GIN infrastructure doesn't do parallel index scans (and
I can't think of an easy way to parallelize it) I think this it's
better to make clear that this isn't related to index scan.

> + * mutex protects all fields before heapdesc.

This comment is still inaccurate.

> + /* FIXME likely duplicate with indtuples */

I think this doesn't have to be duplicate, as we can distinguish
between number of heap tuples and the number of GIN (key, TID) pairs
loaded. This distinction doesn't really exist anywhere else, though,
so to expose this to users we may need changes in
pg_stat_progress_create_index.

While I haven't checked if that distinction is being made in the code,
I think it would be a useful distinction to have.

> GinBufferInit

This seems to depend on the btree operator classes to get sortsupport
functions, bypassing the GIN compare support function (support
function 1) and adding a dependency on the btree opclasses for
indexable types. This can cause "bad" ordering, or failure to build
the index when the parallel path is chosen and no default btree
opclass is defined for the type. I think it'd be better if we allowed
users to specify which sortsupport function to use, or at least use
the correct compare function when it's defined on the attribute's
operator class.

> include/access/gin_tuple.h
> + OffsetNumber attrnum; /* attnum of index key */

I think this would best be AttrNumber-typed? Looks like I didn't
notice or fix that in 0009.

> My plan is to eventually commit the first couple patches, possibly up
> 0007 or even 0009.

Sounds good. I'll see if I have some time to do some cleanup on my
patches (0008 and 0009), as they need some better polish on the
comments and commit messages.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nathan Bossart 2025-01-06 19:20:13 Re: Proposal: add new API to stringinfo
Previous Message Pavel Stehule 2025-01-06 19:10:12 Re: Re: proposal: schema variables