From: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel CREATE INDEX for BRIN indexes
Date: 2023-11-28 15:39:33
Message-ID: CAEze2WhCSKQDhP3nyARWH5Zf+B-5rtDteA7d4QSDa-BuSSkSPA@mail.gmail.com
Lists: pgsql-hackers

On Thu, 23 Nov 2023 at 14:35, Tomas Vondra
<tomas(dot)vondra(at)enterprisedb(dot)com> wrote:
> On 11/23/23 13:33, Matthias van de Meent wrote:
>> The union operator may leak (lots of) memory, so I think it makes
>> sense to keep a context around that can be reset after we've extracted
>> the merge result.
>>
>
> But does the current code actually achieve that? It does create a "brin
> union" context, but then it only does this:
>
>     /* Use our own memory context to avoid retail pfree */
>     cxt = AllocSetContextCreate(CurrentMemoryContext,
>                                 "brin union",
>                                 ALLOCSET_DEFAULT_SIZES);
>     oldcxt = MemoryContextSwitchTo(cxt);
>     db = brin_deform_tuple(bdesc, b, NULL);
>     MemoryContextSwitchTo(oldcxt);
>
> Surely that does not limit the amount of memory used by the actual union
> functions in any way?
Oh, yes, of course. For some reason I thought that covered the calls
to the union operator function too, but it indeed only covers
deserialization. I do think it is still worthwhile to avoid the
create/delete cycle, but I won't hold the patch back for that.
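
To make the suggestion concrete, here is a minimal sketch of the
scratch-context pattern I had in mind (the function name and signature
are made up for illustration; the real code would of course be the
existing union_tuples):

    #include "postgres.h"
    #include "access/brin_internal.h"
    #include "access/brin_tuple.h"

    /*
     * Sketch only -- not the patch's code.  The scratch context is
     * created once per build by the caller and merely reset here.
     */
    static void
    union_tuples_sketch(BrinDesc *bdesc, BrinMemTuple *a, BrinTuple *b,
                        MemoryContext union_cxt)
    {
        MemoryContext oldcxt;
        BrinMemTuple *db;

        oldcxt = MemoryContextSwitchTo(union_cxt);
        db = brin_deform_tuple(bdesc, b, NULL);

        /*
         * ... call the opclass union support proc for each attribute here,
         * merging db into a; anything it leaks ends up in union_cxt rather
         * than in the caller's context ...
         */

        MemoryContextSwitchTo(oldcxt);

        /*
         * The merged summary lives in a (the caller's context), so all the
         * per-merge allocations can be discarded in one go instead of
         * destroying and recreating the context for every merge.
         */
        MemoryContextReset(union_cxt);
    }
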
>>> However, I don't think the number of union_tuples calls is likely to be
>>> very high, especially for large tables. Because we split the table into
>>> 2048 chunks, and then cap the chunk size by 8192. For large tables
>>> (where this matters) we're likely close to 8192.
>>
>> I agree that the merging part of the index creation is the last part,
>> and usually has no high impact on the total performance of the reindex
>> operation, but in memory-constrained environments releasing and then
>> requesting the same chunk of memory over and over again just isn't
>> great.
>
> OK, I'll take a look at the scratch context you suggested.
>
> My point however was we won't actually do that very often, because on
> large tables the BRIN ranges are likely smaller than the parallel scan
> chunk size, so few overlaps. OTOH if the table is small, or if the BRIN
> ranges are large, there'll be few of them.
That's true, so maybe I'm concerned about something that amounts to
only marginal gains.
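(Back-of-the-envelope, assuming the default pages_per_range of 128: an
8192-block chunk covers about 64 ranges, so at most roughly one range in
64 can straddle a chunk boundary and need a union_tuples call.)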
I noticed that the v4 patch doesn't yet update the documentation in
indexam.sgml with am->amcanbuildparallel.
Once that is included and reviewed, I think this will be ready, unless
you want to address any of my comments from upthread (the ones I marked
with 'not in this patch') in this patch.
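
For what it's worth, what I'd expect the docs to cover is just the new
capability flag; a sketch of my understanding, assuming the field is
named amcanbuildparallel as in the patch (a real handler of course fills
in many more fields):

    #include "postgres.h"
    #include "access/amapi.h"
    #include "nodes/nodes.h"

    /* Sketch only: the part of brinhandler() that the new flag touches. */
    static IndexAmRoutine *
    brin_routine_sketch(void)
    {
        IndexAmRoutine *amroutine = makeNode(IndexAmRoutine);

        /* mark the AM as supporting parallel CREATE INDEX */
        amroutine->amcanbuildparallel = true;

        /* (all other callbacks and flags omitted for brevity) */
        return amroutine;
    }
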
Kind regards,
Matthias van de Meent
Neon (https://neon.tech)