Re: Parallel CREATE INDEX for BRIN indexes

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Peter Eisentraut <peter(at)eisentraut(dot)org>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel CREATE INDEX for BRIN indexes
Date: 2024-08-16 09:22:45
Message-ID: 618bc573-a0df-4ea9-bb71-aecff104c5f5@vondra.me
Lists: pgsql-hackers

On 8/15/24 15:48, Peter Eisentraut wrote:
> On 13.04.24 23:04, Tomas Vondra wrote:
>>>> While preparing a differential code coverage report between 16 and
>>>> HEAD, one
>>>> thing that stands out is the parallel brin build code. Neither on
>>>> coverage.postgresql.org nor locally is that code reached during our
>>>> tests.
>>>>
>>>
>>> Thanks for pointing this out, it's definitely something that I need to
>>> improve (admittedly, should have been part of the patch). I'll also look
>>> into eliminating the difference between BTREE and BRIN parallel builds,
>>> mentioned in my last message in this thread.
>>>
>>
>> Here's a couple patches adding a test for the parallel CREATE INDEX with
>> BRIN. The actual test is 0003/0004 - I added the test to pageinspect,
>> because that allows cross-checking the index to one built without
>> parallelism, which I think is better than just doing CREATE INDEX
>> without properly testing it produces correct results.
>
> These pageinspect tests added a new use of the md5() function.  We got
> rid of those in the tests for PG17.  You could write the test case with
> something like
>
>  SELECT (CASE WHEN (mod(i,231) = 0) OR (i BETWEEN 3500 AND 4000) THEN NULL ELSE i END),
> -       (CASE WHEN (mod(i,233) = 0) OR (i BETWEEN 3750 AND 4250) THEN NULL ELSE md5(i::text) END),
> +       (CASE WHEN (mod(i,233) = 0) OR (i BETWEEN 3750 AND 4250) THEN NULL ELSE encode(sha256(i::text::bytea), 'hex') END),
>         (CASE WHEN (mod(i,233) = 0) OR (i BETWEEN 3850 AND 4500) THEN NULL ELSE (i/100) + mod(i,8) END)
>
> But this changes the test output slightly and I'm not sure if this gives
> you the data distribution that you need for your test.  Could you check
> this please?
>

I think this is fine. The output only changes because sha256 produces
longer values than md5, so the summaries are longer and the index ends up
one page larger. AFAIK that has no impact on the test.
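
For illustration, the length difference is easy to check outside the
database. This is only a sketch in Python, not part of the patch: the two
helper functions are hypothetical stand-ins for md5(i::text) and
encode(sha256(i::text::bytea), 'hex'), and the sample value is arbitrary.

```python
import hashlib


def md5_hex(i: int) -> str:
    # Stand-in for the SQL expression md5(i::text)
    return hashlib.md5(str(i).encode()).hexdigest()


def sha256_hex(i: int) -> str:
    # Stand-in for encode(sha256(i::text::bytea), 'hex')
    return hashlib.sha256(str(i).encode()).hexdigest()


i = 1234  # arbitrary sample value, not taken from the test
print(len(md5_hex(i)), len(sha256_hex(i)))  # 32 64
```

Every hex digest is twice as long with sha256 (64 characters instead of
32), which is consistent with the summaries growing.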

regards

--
Tomas Vondra
