From: | John Naylor <johncnaylorls(at)gmail(dot)com> |
---|---|
To: | Xiang Gao <Xiang(dot)Gao(at)arm(dot)com> |
Cc: | "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Question about the Implementation of vector32_is_highbit_set on ARM |
Date: | 2023-11-20 09:05:43 |
Message-ID: | CANWCAZZj1Vn8Ee0JoZj-4ZvE48YrKYnFh-P9OcsUkeBrj62p6g@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Nov 8, 2023 at 2:44 PM Xiang Gao <Xiang(dot)Gao(at)arm(dot)com> wrote:
> * function. We could instead adopt the behavior of Arm's vmaxvq_u32(), i.e.
> * check each 32-bit element, but that would require an additional mask
> * operation on x86.
> */
> But I still don't understand why the vmaxvq_u32 intrinsic is not used on the arm platform.
The current use case expects all 1's or all 0's in a 32-bit lane. If
anyone tried using it for arbitrary values, vmaxvq_u32 could give a
different answer than on x86 using _mm_movemask_epi8, so I think
that's the origin of that comment. But it's still a maintenance hazard
as is, since x86 wouldn't work for arbitrary values.

It seems the path forward is to rename this function to
vector32_is_any_lane_set(), as in the attached (untested on Arm). That
would allow each implementation to use the most efficient path,
whether it's by 8- or 32-bit lanes. If we someday needed to look at
only the high bits, we would need a new function that performed the
necessary masking on x86.

It's possible this method could shave cycles on Arm in some 8-bit lane
cases where we don't actually care about the high bit specifically,
since the movemask equivalent is slow on that platform, but I haven't
looked yet.
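
To make the discrepancy concrete, here is a rough scalar sketch of the two semantics. It is not the code in the attached patch and uses no real intrinsics: the type vec32x4 and the functions make_vec(), model_movemask_any() and model_maxv_any() are hypothetical names made up for illustration. It also assumes the vmaxvq_u32 result would be compared against zero, which is only one possible way to use it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for a 128-bit vector of four 32-bit lanes. */
typedef struct
{
	uint32_t	lane[4];
} vec32x4;

static inline vec32x4
make_vec(uint32_t a, uint32_t b, uint32_t c, uint32_t d)
{
	vec32x4		v = {{a, b, c, d}};

	return v;
}

/*
 * Models the x86 test "_mm_movemask_epi8(v) != 0": true if the high bit of
 * any of the 16 bytes is set.
 */
static inline bool
model_movemask_any(vec32x4 v)
{
	for (int i = 0; i < 4; i++)
	{
		if (v.lane[i] & UINT32_C(0x80808080))
			return true;
	}
	return false;
}

/*
 * Models an Arm test of the form "vmaxvq_u32(v) != 0" (an assumption about
 * how the intrinsic would be used): true if any 32-bit lane is nonzero.
 */
static inline bool
model_maxv_any(vec32x4 v)
{
	uint32_t	max = 0;

	for (int i = 0; i < 4; i++)
	{
		if (v.lane[i] > max)
			max = v.lane[i];
	}
	return max != 0;
}

/* Self-check: the models agree on all-0/all-1 lanes, differ otherwise. */
static inline void
check_models(void)
{
	vec32x4		zeros = make_vec(0, 0, 0, 0);
	vec32x4		match = make_vec(0, UINT32_C(0xFFFFFFFF), 0, 0);
	vec32x4		arbitrary = make_vec(0, 0, 1, 0);

	assert(!model_movemask_any(zeros) && !model_maxv_any(zeros));
	assert(model_movemask_any(match) && model_maxv_any(match));
	assert(!model_movemask_any(arbitrary) && model_maxv_any(arbitrary));
}
```

With lanes restricted to all 1's or all 0's the two models always agree, which is why a name like vector32_is_any_lane_set() describes both paths. A lane holding 1 is the kind of arbitrary value where they diverge.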
Attachment | Content-Type | Size |
---|---|---|
v1-is_any_lane_set.patch | text/x-patch | 2.0 KB |