Re: Parallel CREATE INDEX for GIN indexes

From: Kirill Reshke <reshkekirill(at)gmail(dot)com>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel CREATE INDEX for GIN indexes
Date: 2024-11-24 18:04:38
Message-ID: CALdSSPhNjVzZXQQFiBWZBX8xO4JOu4PDc5XvjH=19JqzEmBiFQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, 8 Oct 2024 at 17:06, Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> On 10/8/24 04:03, Michael Paquier wrote:
> >
> > _gin_parallel_build_main() is introduced in 0001. Please make sure to
> > pass down a query ID.
>
> Thanks for the ping. Here's an updated patch doing that, and also fixing
> a couple whitespace issues. No other changes, but I plan to get back to
> this patch soon - before the next CF.
>
>
> regards
>
> --
> Tomas Vondra

Hi! I was looking through this series of patches because thread of
GIN&GIST amcheck patch references it.

I have spotted this in gininsert.c:
1)
>/*
>* Store shared tuplesort-private state, for which we reserved space.
>* Then, initialize opaque state using tuplesort routine.
>*/
>sharedsort = (Sharedsort *) shm_toc_allocate(pcxt->toc, estsort);
>tuplesort_initialize_shared(sharedsort, scantuplesortstates,>
pcxt->seg);
>/*
>* Store shared tuplesort-private state, for which we reserved space.
>* Then, initialize opaque state using tuplesort routine.
>*/

Is it necessary to duplicate the entire comment?

And, while we are here, isn't it " initialize the opaque state "?

2) typo :
* the TID array, and returning false if it's too large (more thant work_mem,

3) in _gin_build_tuple:

....
else if (typlen == -2)
keylen = strlen(DatumGetPointer(key)) + 1;
else
elog(ERROR, "invalid typlen");

Maybe `elog(ERROR, "invalid typLen: %d", typLen); ` as in `datumGetSize`?

4) in _gin_compare_tuples:

> if ((a->category == GIN_CAT_NORM_KEY) &&
(b->category == GIN_CAT_NORM_KEY))

maybe just a->category == GIN_CAT_NORM_KEY? a->category is already
equal to b->category because of previous if statements.

5) In _gin_partition_sorted_data:

>char fname[128];
>sprintf(fname, "worker-%d", i);

Other places use MAXPGPATH in similar cases.

Also, code `sprintf(fname, "worker-%d",...);` duplicates. This might
be error-prone. Should we have a macro/inline function for this?

I will take another look later, maybe reporting real problems, not nit-picks.

--
Best regards,
Kirill Reshke

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2024-11-24 21:54:38 Re: Use or not record count on examples
Previous Message Tom Lane 2024-11-24 16:57:01 Re: Missing INFO on client_min_messages