Re: Parallel CREATE INDEX for GIN indexes

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Parallel CREATE INDEX for GIN indexes
Date: 2024-05-13 08:19:43
Message-ID: 87y18ektdn.fsf@163.com
Lists: pgsql-hackers


Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:

>>> 7) v20240502-0007-Detect-wrap-around-in-parallel-callback.patch
>>>
>>> There's one more efficiency problem - the parallel scans are required to
>>> be synchronized, i.e. the scan may start half-way through the table, and
>>> then wrap around. Which however means the TID list will have a very wide
>>> range of TID values, essentially the min and max of for the key.
>>
>> I have two questions here, and both are general GIN index questions
>> rather than questions specific to this patch.
>>
>> 1. What does the "wrap around" mean in the "the scan may start half-way
>> through the table, and then wrap around". Searching "wrap" in
>> gin/README gets nothing.
>>
>
> The "wrap around" is about the scan used to read data from the table
> when building the index. A "sync scan" may start e.g. at TID (1000,0)
> and read till the end of the table, then wrap around and return the
> remaining part from the beginning of the table, blocks 0-999.
>
> This means the callback would not see a monotonically increasing
> sequence of TIDs.
>
> Which is why the serial build disables sync scans, allowing simply
> appending values to the sorted list, and even with regular flushes of
> data into the index we can simply append data to the posting lists.

Thanks for the hints; I see now that the sync scan strategy comes from
syncscan.c.
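To make sure I follow, here is a minimal sketch (a hypothetical helper,
not a PostgreSQL API) of why a sync scan that starts half-way through
the table cannot feed the build callback a monotonically increasing
sequence of block numbers:

```c
#include <stdbool.h>

/* Returns true iff visiting blocks in sync-scan order (starting at
 * start_block and wrapping around at nblocks) yields a monotonically
 * increasing sequence of block numbers.  Illustration only. */
static bool
scan_is_monotonic(int nblocks, int start_block)
{
    int prev = -1;

    for (int i = 0; i < nblocks; i++)
    {
        int blkno = (start_block + i) % nblocks;

        if (blkno < prev)
            return false;   /* the wrap-around point */
        prev = blkno;
    }
    return true;
}

/* scan_is_monotonic(5000, 0)    -> true  (no wrap, safe to append)
 * scan_is_monotonic(5000, 1000) -> false (sync scan, wraps at block 4999) */
```

So disabling sync scans for the serial build is what makes the simple
"append to the sorted list" strategy valid.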

>>> Without 0006 this would cause frequent failures of the index build, with
>>> the error I already mentioned:
>>>
>>> ERROR: could not split GIN page; all old items didn't fit

>> 2. I can't understand the below error.
>>
>>> ERROR: could not split GIN page; all old items didn't fit

> if (!append || ItemPointerCompare(&maxOldItem, &remaining) >= 0)
> elog(ERROR, "could not split GIN page; all old items didn't fit");
>
> It can fail simply because of the !append part.

Got it, thanks!

>> If we split the blocks among workers one block at a time, we will
>> have a serious issue like this one. If we instead assign N blocks at
>> a time, with N chosen so that the blocks roughly fill work_mem before
>> spilling to the dedicated temp file, we could make things much
>> better, couldn't we?

> I don't understand the question. The blocks are distributed to workers
> by the parallel table scan, and it certainly does not do that block by
> block. But even if it did, that's not a problem for this code.

OK, I see that ParallelBlockTableScanWorkerData.phsw_chunk_size is
designed for this.
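For my own understanding, a rough model (real PostgreSQL claims chunks
dynamically rather than round-robin, so chunk_owner is purely
hypothetical) of chunked block assignment, where each worker receives
whole runs of consecutive blocks instead of an interleaved
block-by-block split:

```c
/* Hypothetical model of chunked block distribution: blocks are grouped
 * into chunks of chunk_size consecutive blocks (cf. the chunk size kept
 * in ParallelBlockTableScanWorkerData.phsw_chunk_size), and chunks are
 * dealt to workers round-robin.  The point is that a worker's blocks
 * come in contiguous runs, not one block at a time. */
static int
chunk_owner(int blkno, int chunk_size, int nworkers)
{
    return (blkno / chunk_size) % nworkers;
}
```

With chunk_size = 64 and 4 workers, blocks 0-63 all land on worker 0
and blocks 64-127 on worker 1, so each worker's TID lists cover
contiguous ranges.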

> The problem is that if the scan wraps around, then one of the TID lists
> for a given worker will have the min TID and max TID, so it will overlap
> with every other TID list for the same key in that worker. And when the
> worker does the merging, this list will force a "full" merge sort for
> all TID lists (for that key), which is very expensive.

OK.
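If I restate that in code (a simplified sketch with assumed struct and
function names, not the patch's actual data structures): two sorted TID
lists can be cheaply concatenated only when their TID ranges are
disjoint, and a list produced by a wrapped-around scan spans the
table's min and max TIDs, so it overlaps every other list for the key.

```c
#include <stdbool.h>

/* Simplified bounds of one worker's sorted TID list for a key
 * (hypothetical; TIDs reduced to plain integers here). */
typedef struct
{
    int min_tid;
    int max_tid;
} TidRange;

/* Disjoint ranges allow simple concatenation of the sorted lists;
 * overlapping ranges force a full merge sort of the lists. */
static bool
ranges_overlap(TidRange a, TidRange b)
{
    return a.min_tid <= b.max_tid && b.min_tid <= a.max_tid;
}
```

A wrapped list with bounds {0, 9999} overlaps a list with bounds
{2000, 2999}, while {0, 999} and {2000, 2999} stay disjoint, which is
why the wrap-around forces the expensive merge path.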

Thanks for all the answers, they are pretty instructive!

--
Best Regards
Andy Fan
