From: | "Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com" <Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | "Malladi, Rama" <ramamalladi(at)hotmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, "Ragesh(dot)Hajela(at)fujitsu(dot)com" <Ragesh(dot)Hajela(at)fujitsu(dot)com>, Salvatore Dipietro <dipiets(at)amazon(dot)com>, "Devanga(dot)Susmitha(at)fujitsu(dot)com" <Devanga(dot)Susmitha(at)fujitsu(dot)com> |
Subject: | Re: [PATCH] SVE popcount support |
Date: | 2025-02-19 09:31:50 |
Message-ID: | OSBPR01MB266482AB53FDF8638015607D97C52@OSBPR01MB2664.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
> Hm. Any idea why that is? I wonder if the compiler isn't using as many
> SVE registers as it could for this.
Not sure. We tried forcing loop unrolling by adding the line below to the
Makefile, but the results were the same.
pg_popcount_sve.o: CFLAGS += ${CFLAGS_UNROLL_LOOPS} -march=native
> I've also noticed that the latest patch doesn't compile on my M3 macOS
> machine. After a quick glance, I think the problem is that the
> TRY_POPCNT_FAST macro is set, so it's trying to compile the assembly
> versions.
Fixed. The previous version tried to reuse the existing "choose" logic guarded
by TRY_POPCNT_FAST. The latest patch bypasses TRY_POPCNT_FAST by using separate
choose logic for aarch64.
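
For readers following along, here is a rough sketch of what a separate runtime "choose" step on aarch64 could look like. It is not the code from the attached patch. All names (popcount_impl, popcount_choose, popcount_dispatch, sve_available, popcount_sve_standin) are hypothetical, the SVE routine is only a stand-in that calls the portable loop, and the detection assumes Linux, where getauxval(AT_HWCAP) reports the HWCAP_SVE bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__aarch64__) && defined(__linux__)
#include <sys/auxv.h>           /* getauxval(), AT_HWCAP */
#ifndef HWCAP_SVE
#define HWCAP_SVE (1 << 22)     /* arm64 Linux hwcap bit for SVE */
#endif
#endif

typedef uint64_t (*popcount_fn) (const char *buf, size_t bytes);

/* Portable fallback: clear the lowest set bit until the byte is zero. */
static uint64_t
popcount_portable(const char *buf, size_t bytes)
{
    uint64_t    count = 0;

    while (bytes--)
    {
        unsigned char b = (unsigned char) *buf++;

        while (b)
        {
            b &= (unsigned char) (b - 1);
            count++;
        }
    }
    return count;
}

/*
 * Stand-in for an SVE implementation (hypothetical).  A real version would
 * live in its own file built with SVE enabled; here it just reuses the
 * portable loop so the sketch compiles everywhere.
 */
static uint64_t
popcount_sve_standin(const char *buf, size_t bytes)
{
    return popcount_portable(buf, bytes);
}

/* Returns nonzero if the kernel reports SVE support (assumes Linux). */
static int
sve_available(void)
{
#if defined(__aarch64__) && defined(__linux__)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return 0;
#endif
}

static uint64_t popcount_choose(const char *buf, size_t bytes);

/* Starts out pointing at the chooser; replaced on the first call. */
static popcount_fn popcount_impl = popcount_choose;

/* First-call hook: pick an implementation once, then forward the call. */
static uint64_t
popcount_choose(const char *buf, size_t bytes)
{
    popcount_impl = sve_available() ? popcount_sve_standin : popcount_portable;
    assert(popcount_impl != popcount_choose);   /* guards against recursion */
    return popcount_impl(buf, bytes);
}

/* Public entry point: every call goes through the function pointer. */
uint64_t
popcount_dispatch(const char *buf, size_t bytes)
{
    return popcount_impl(buf, bytes);
}
```

Because this chooser is independent of any x86-specific macro, a platform such as macOS on Apple silicon would simply take the portable branch in this sketch rather than trying to build unrelated assembly paths.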
-Chiranmoy
Attachment | Content-Type | Size |
---|---|---|
v5-0001-SVE-support-for-popcount-and-popcount-masked.patch | application/octet-stream | 14.3 KB |