From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | Ants Aasma <ants(at)cybertec(dot)at>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: add AVX2 support to simd.h |
Date: | 2024-03-20 06:57:54 |
Message-ID: | CANWCAZbphuJTDjusRBGWk1R-z8Z-kvjMjsC5X4A6rjTN54MOFw@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, Mar 19, 2024 at 11:30 PM Nathan Bossart
<nathandbossart(at)gmail(dot)com> wrote:
> > Sounds similar in principle, but it looks really complicated. I don't
> > think the additional loops and branches are a good way to go, either
> > for readability or for branch prediction. My sketch has one branch for
> > which loop to do, and then performs only one loop. Let's do the
> > simplest thing that could work. (I think we might need a helper
> > function to do the block, but the rest should be easy)
>
> I tried to trim some of the branches, and came up with the attached patch.
> I don't think this is exactly what you were suggesting, but I think it's
> relatively close. My testing showed decent benefits from using 2 vectors
> when there aren't enough elements for 4, so I've tried to keep that part
> intact.
I would caution against that if the benchmark is repeatedly running
against a static number of elements, because the branch predictor will
be right all the time (except maybe when it exits a loop, not sure).
We probably don't need to go to the trouble to construct a benchmark
with some added randomness, but we have to be careful not to overfit to
what the test is actually measuring.
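
To make the concern concrete, here is a minimal sketch of a driver that can run either with a fixed element count or with a count that changes on every call. It is not taken from the patch or from the benchmark used in this thread: `search_u32`, `xorshift32`, and `run_searches` are hypothetical names, and the plain scalar loop only stands in for whatever search function is actually being measured.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the function under test: plain linear search. */
static bool
search_u32(uint32_t key, const uint32_t *base, size_t nelem)
{
	for (size_t i = 0; i < nelem; i++)
	{
		if (base[i] == key)
			return true;
	}
	return false;
}

/* Small xorshift PRNG so the sequence of lengths is reproducible. */
static uint32_t
xorshift32(uint32_t *state)
{
	uint32_t	x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return x;
}

/*
 * Run 'iters' searches over 'base' (max_len must be > 0).  With vary = false
 * every call sees max_len elements, so any branch on the length is perfectly
 * predictable.  With vary = true the length is drawn from [1, max_len] on
 * each call.  The hit count is returned so the compiler cannot discard the
 * work.
 */
static uint64_t
run_searches(const uint32_t *base, size_t max_len, uint32_t iters, bool vary)
{
	uint32_t	state = 0x9E3779B9u;	/* arbitrary nonzero seed */
	uint64_t	hits = 0;

	for (uint32_t i = 0; i < iters; i++)
	{
		size_t		nelem = vary ? (xorshift32(&state) % max_len) + 1 : max_len;

		/* Look for the last element of the slice so the whole slice is scanned. */
		hits += search_u32(base[nelem - 1], base, nelem);
	}
	return hits;
}
```

Timing `run_searches()` with `vary` set to false and then true, around the real function instead of `search_u32`, would be one way to see how much of a measured gain depends on the length being the same on every call. The two runs do different amounts of total work, so they are best compared per element searched rather than per call.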