From: | John Naylor <john(dot)naylor(at)enterprisedb(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: RFC: Improve CPU cache locality of syscache searches |
Date: | 2021-08-05 16:27:49 |
Message-ID: | CAFBsxsGkBtEVjjMLZcRQqKxUCZBauoiLBPmH3X-EDSSWd__Yug@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Aug 4, 2021 at 3:44 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2021-08-04 12:39:29 -0400, John Naylor wrote:
> > typedef struct cc_bucket
> > {
> >     uint32 hashes[4];
> >     catctup *ct[4];
> >     dlist_head conflicts;
> > } cc_bucket;
>
> I'm not convinced that the above is the right idea though. Even if the hash
> matches, you're still going to need to fetch at least catctup->keys[0] from
> a separate cacheline to be able to return the cache entry.
I see your point. It doesn't make sense to inline only part of the
information needed.
> struct cc_bucket_1
> {
>     uint32 hashes[3]; // 12
>     // 4 bytes alignment padding
>     Datum key0s[3]; // 24
>     catctup *ct[3]; // 24
>     // cacheline boundary
>     dlist_head conflicts; // 16
> };
>
> would be better for 1 key values?
>
> It's obviously annoying to need different bucket types for different key
> counts, but given how much 3 unused key Datums waste, it seems worth paying
> for?
Yeah, it's annoying, but it does make a big difference to keep out unused
Datums:
keys    cachelines per bucket
        3 values    4 values
1       1 1/4       1 1/2
2       1 5/8       2
3       2           2 1/2
4       2 3/8       3
Or, looking at it another way, limiting the bucket size to 2 cachelines, we
can fit:
keys    values
1       5
2       4
3       3
4       2
Although I'm guessing inlining just two values in the 4-key case wouldn't
buy much.
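The figures in both tables can be reproduced with a little arithmetic. Here is a minimal sketch in C, assuming 64-byte cachelines, 4-byte hashes, 8-byte Datums and pointers, a 16-byte dlist_head, and the field order of the cc_bucket_1 layout quoted above (hashes, padding to 8 bytes, keys, tuple pointers, conflict list). The names bucket_bytes and max_values are hypothetical helpers for illustration, not PostgreSQL functions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes for a typical 64-bit build (not read from PostgreSQL headers). */
#define CACHELINE_BYTES   64
#define HASH_BYTES        4     /* uint32 hash value */
#define DATUM_BYTES       8     /* sizeof(Datum) */
#define POINTER_BYTES     8     /* sizeof(catctup *) */
#define DLIST_HEAD_BYTES  16    /* dlist_head holds two pointers */

/* Round n up to a multiple of 8, the alignment of Datums and pointers. */
size_t
align8(size_t n)
{
    return (n + 7) & ~(size_t) 7;
}

/*
 * Size in bytes of a hypothetical bucket that inlines nvalues entries,
 * each with nkeys key Datums, followed by the conflict list head.
 */
size_t
bucket_bytes(int nkeys, int nvalues)
{
    assert(nkeys >= 1 && nkeys <= 4 && nvalues >= 1);

    return align8((size_t) nvalues * HASH_BYTES)        /* hashes + padding */
        + (size_t) nvalues * nkeys * DATUM_BYTES        /* inlined keys */
        + (size_t) nvalues * POINTER_BYTES              /* tuple pointers */
        + DLIST_HEAD_BYTES;                             /* conflict list */
}

/* Largest nvalues whose bucket fits in max_cachelines; 0 if none fits. */
int
max_values(int nkeys, int max_cachelines)
{
    size_t limit = (size_t) max_cachelines * CACHELINE_BYTES;
    int n = 0;

    while (bucket_bytes(nkeys, n + 1) <= limit)
        n++;
    return n;
}
```

For example, bucket_bytes(1, 3) gives 80 bytes, which is 1 1/4 cachelines, and max_values(1, 2) gives 5, matching the first rows of the two tables.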
> If we stuffed four values into one bucket we could potentially SIMD the hash
> and Datum comparisons ;)
;-) That's an interesting future direction to consider when we support
building with x86-64-v2. It'd be pretty easy to compare a vector of hashes
and quickly get the array index for the key comparisons (ignoring for the
moment how to handle the rare case of multiple identical hashes).
However, we currently don't memcmp() the Datums and instead call an
"eqfast" function, so I don't see how that part would work in a vector
setting.
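To make the hash half of that concrete, here is a rough sketch of comparing four stored hashes against a probe hash at once. It is only an illustration under assumptions: it uses SSE2 intrinsics, which are enough for a 32-bit equality test, with a scalar fallback for other platforms, and the names match_hashes4 and first_match are hypothetical rather than anything in the PostgreSQL tree. Returning a bitmask instead of a single index is one way to cope with multiple identical hashes, since the caller can run the key comparison for each set bit.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/*
 * Return a 4-bit mask in which bit i is set when hashes[i] == probe.
 * Hypothetical helper; the key comparison is still left to the caller.
 */
unsigned
match_hashes4(const uint32_t hashes[4], uint32_t probe)
{
#if defined(__SSE2__)
    /* Load the four hashes, broadcast the probe, compare lane by lane. */
    __m128i stored = _mm_loadu_si128((const __m128i *) hashes);
    __m128i wanted = _mm_set1_epi32((int) probe);
    __m128i equal = _mm_cmpeq_epi32(stored, wanted);

    /* Collect the top bit of each 32-bit lane into the low 4 bits. */
    return (unsigned) _mm_movemask_ps(_mm_castsi128_ps(equal));
#else
    /* Portable fallback producing the same mask. */
    unsigned mask = 0;

    for (int i = 0; i < 4; i++)
        if (hashes[i] == probe)
            mask |= 1u << i;
    return mask;
#endif
}

/* Index of the lowest set bit in a 4-bit match mask, or -1 if it is empty. */
int
first_match(unsigned mask)
{
    assert(mask < 16);

    for (int i = 0; i < 4; i++)
        if (mask & (1u << i))
            return i;
    return -1;
}
```

With hashes {10, 20, 30, 20} and a probe of 20, the mask has bits 1 and 3 set, so the caller would try the key comparison at index 1 first and at index 3 only if that one fails.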
--
John Naylor
EDB: http://www.enterprisedb.com