From: | Andy Fan <zhihuifan1213(at)163(dot)com> |
---|---|
To: | Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> |
Cc: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: Parallel CREATE INDEX for GIN indexes |
Date: | 2024-05-13 08:19:43 |
Message-ID: | 87y18ektdn.fsf@163.com |
Lists: | pgsql-hackers |
Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com> writes:
>>> 7) v20240502-0007-Detect-wrap-around-in-parallel-callback.patch
>>>
>>> There's one more efficiency problem - the parallel scans are required to
>>> be synchronized, i.e. the scan may start half-way through the table, and
>>> then wrap around. Which however means the TID list will have a very wide
>>> range of TID values, essentially the min and max TID for the key.
>>
>> I have two questions here, and both of them are general GIN index
>> questions rather than questions about the patch here.
>>
>> 1. What does the "wrap around" mean in the "the scan may start half-way
>> through the table, and then wrap around". Searching "wrap" in
>> gin/README gets nothing.
>>
>
> The "wrap around" is about the scan used to read data from the table
> when building the index. A "sync scan" may start e.g. at TID (1000,0)
> and read till the end of the table, and then wraps and returns the
> remaining part at the beginning of the table for blocks 0-999.
>
> This means the callback would not see a monotonically increasing
> sequence of TIDs.
>
> Which is why the serial build disables sync scans, allowing simply
> appending values to the sorted list, and even with regular flushes of
> data into the index we can simply append data to the posting lists.
Thanks for the hints; I see now that the sync scan strategy comes from
syncscan.c.
>>> Without 0006 this would cause frequent failures of the index build, with
>>> the error I already mentioned:
>>>
>>> ERROR: could not split GIN page; all old items didn't fit
>> 2. I can't understand the error below.
>>
>>> ERROR: could not split GIN page; all old items didn't fit
> if (!append || ItemPointerCompare(&maxOldItem, &remaining) >= 0)
> elog(ERROR, "could not split GIN page; all old items didn't fit");
>
> It can fail simply because of the !append part.
Got it, thanks!
>> If we split the blocks among workers one block at a time, we will have a
>> serious issue like this one. If we instead hand out N blocks at a time,
>> with N chosen so that the blocks roughly fill work_mem (which backs the
>> dedicated temp file), could we make things much better?
> I don't understand the question. The blocks are distributed to workers
> by the parallel table scan, and it certainly does not do that block by
> block. But even if it did, that's not a problem for this code.
OK, I see that ParallelBlockTableScanWorkerData.phsw_chunk_size is designed
for this.
> The problem is that if the scan wraps around, then one of the TID lists
> for a given worker will have the min TID and max TID, so it will overlap
> with every other TID list for the same key in that worker. And when the
> worker does the merging, this list will force a "full" merge sort for
> all TID lists (for that key), which is very expensive.
OK.
Thanks for all the answers, they are pretty instructive!
--
Best Regards
Andy Fan