From: | Zhang Mingli <zmlpostgres(at)gmail(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: optimize lookups in snapshot [sub]xip arrays |
Date: | 2022-07-24 04:48:25 |
Message-ID: | 76ACEAF0-8050-41D5-B2F5-01164533BE99@gmail.com |
Lists: | pgsql-hackers |
Hi, all
>
>          if (!snapshot->suboverflowed)
>          {
>              /* we have full data, so search subxip */
> -            int32       j;
> -
> -            for (j = 0; j < snapshot->subxcnt; j++)
> -            {
> -                if (TransactionIdEquals(xid, snapshot->subxip[j]))
> -                    return true;
> -            }
> +            if (XidInXip(xid, snapshot->subxip, snapshot->subxcnt,
> +                         &snapshot->subxiph))
> +                return true;
>
>              /* not there, fall through to search xip[] */
>          }
If snapshot->suboverflowed is false, then subxcnt must be less than PGPROC_MAX_CACHED_SUBXIDS, which is currently 64.
And we won't use the hash table if the element count is less than XIP_HASH_MIN_ELEMENTS, which is 128 in the patch under discussion.
So the subxid hash table will never be used, right?
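To make the question concrete, here is a minimal sketch of the comparison I have in mind. It takes the two constants at the values quoted above and assumes that subxcnt is bounded by one backend's subxid cache; if subxip can hold subxids from many backends, that bound would not apply. The helper would_use_hash() is hypothetical and not part of the patch, and the exact comparison the patch uses is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as quoted in this thread; both are assumptions of this sketch. */
#define PGPROC_MAX_CACHED_SUBXIDS 64    /* per-backend subxid cache size */
#define XIP_HASH_MIN_ELEMENTS     128   /* hash threshold proposed in the patch */

/*
 * Hypothetical helper (not in the patch): reports whether a lookup over
 * "count" array elements would take the hash-table path.
 */
static bool
would_use_hash(uint32_t count)
{
    return count >= XIP_HASH_MIN_ELEMENTS;
}
```

Under that assumption, would_use_hash() is false for every count up to PGPROC_MAX_CACHED_SUBXIDS, which is the case I am asking about.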
Regards,
Zhang Mingli
> On Jul 14, 2022, at 01:09, Nathan Bossart <nathandbossart(at)gmail(dot)com> wrote:
>
> Hi hackers,
>
> A few years ago, there was a proposal to create hash tables for long
> [sub]xip arrays in snapshots [0], but the thread seems to have fizzled out.
> I was curious whether this idea still showed measurable benefits, so I
> revamped the patch and ran the same test as before [1]. Here are the
> results for 60-second runs on an r5d.24xlarge with the data directory on
> the local NVMe storage:
>
> writers  HEAD  patch  diff
> ---------------------------
>      16   659    664   +1%
>      32   645    663   +3%
>      64   659    692   +5%
>     128   641    716  +12%
>     256   619    610   -1%
>     512   530    702  +32%
>     768   469    582  +24%
>    1000   367    577  +57%
>
> As before, the hash table approach seems to provide a decent benefit at
> higher client counts, so I felt it was worth reviving the idea.
>
> The attached patch has some key differences from the previous proposal.
> For example, the new patch uses simplehash instead of open-coding a new
> hash table. Also, I've bumped up the threshold for creating hash tables to
> 128 based on the results of my testing. The attached patch waits until a
> lookup of [sub]xip before generating the hash table, so we only need to
> allocate enough space for the current elements in the [sub]xip array, and
> we avoid allocating extra memory for workloads that do not need the hash
> tables. I'm slightly worried about increasing the number of memory
> allocations in this code path, but the results above seemed encouraging on
> that front.
>
> Thoughts?
>
> [0] https://postgr.es/m/35960b8af917e9268881cd8df3f88320%40postgrespro.ru
> [1] https://postgr.es/m/057a9a95-19d2-05f0-17e2-f46ff20e9b3e%402ndquadrant.com
>
> --
> Nathan Bossart
> Amazon Web Services: https://aws.amazon.com
> <v1-0001-Optimize-lookups-in-snapshot-transactions-in-prog.patch>
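For readers following along, the lazy approach described in the quoted message (linear search below a threshold, hash table built only on the first lookup of a large array) can be sketched roughly as follows. This is not the patch's code: the patch uses simplehash, whereas this sketch swaps in a small open-addressing table. XidHash, xid_hash_build() and xid_in_xip() are hypothetical names, and the sketch assumes 0 never appears in the array as a valid xid.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t TransactionId;         /* stand-in for PostgreSQL's typedef */

#define XIP_HASH_MIN_ELEMENTS 128       /* threshold mentioned in the thread */

/* Hypothetical lazily built set; plain open addressing, NOT simplehash. */
typedef struct XidHash
{
    TransactionId *slots;   /* NULL until first use; 0 marks an empty slot */
    uint32_t       mask;    /* table size minus one (size is a power of two) */
} XidHash;

static uint32_t
xid_slot(TransactionId xid, uint32_t mask)
{
    return (xid * 2654435761u) & mask;  /* multiplicative hash */
}

/* Build the table sized for the current array only. */
static bool
xid_hash_build(XidHash *h, const TransactionId *xip, uint32_t xcnt)
{
    uint32_t size = 2;

    while (size < xcnt * 2)             /* keep load factor at or below 50% */
        size <<= 1;
    h->slots = calloc(size, sizeof(TransactionId));
    if (h->slots == NULL)
        return false;
    h->mask = size - 1;
    for (uint32_t i = 0; i < xcnt; i++)
    {
        uint32_t pos = xid_slot(xip[i], h->mask);

        while (h->slots[pos] != 0)      /* linear probing */
            pos = (pos + 1) & h->mask;
        h->slots[pos] = xip[i];
    }
    return true;
}

static bool
xid_in_xip(TransactionId xid, const TransactionId *xip, uint32_t xcnt,
           XidHash *h)
{
    /* Build the hash lazily, only for large arrays, on the first lookup. */
    if (xcnt >= XIP_HASH_MIN_ELEMENTS &&
        (h->slots != NULL || xid_hash_build(h, xip, xcnt)))
    {
        uint32_t pos = xid_slot(xid, h->mask);

        while (h->slots[pos] != 0)
        {
            if (h->slots[pos] == xid)
                return true;
            pos = (pos + 1) & h->mask;
        }
        return false;
    }

    /* Small array, or allocation failed: fall back to a linear search. */
    for (uint32_t i = 0; i < xcnt; i++)
    {
        if (xip[i] == xid)
            return true;
    }
    return false;
}
```

A caller would start with a zero-initialized XidHash per array; workloads whose arrays stay under the threshold never allocate, which is the memory trade-off the quoted message describes.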