From: | "Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com" <Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com> |
---|---|
To: | Nathan Bossart <nathandbossart(at)gmail(dot)com> |
Cc: | "Devanga(dot)Susmitha(at)fujitsu(dot)com" <Devanga(dot)Susmitha(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Ragesh(dot)Hajela(at)fujitsu(dot)com" <Ragesh(dot)Hajela(at)fujitsu(dot)com> |
Subject: | Re: [PATCH] Hex-coding optimizations using SVE on ARM. |
Date: | 2025-01-13 15:48:49 |
Message-ID: | TY2PR01MB2667B294A9D2556C05BE02A0971F2@TY2PR01MB2667.jpnprd01.prod.outlook.com |
Lists: | pgsql-hackers |
On Fri, Jan 10, 2025 at 09:38:14AM -0600, Nathan Bossart wrote:
> Do you mean that the auto-vectorization worked and you observed no
> performance improvement, or the auto-vectorization had no effect on the
> code generated?
Auto-vectorization is working now with the following addition on Graviton 3 (m7g.4xlarge) with GCC 11.4, and the results match yours. Previously, auto-vectorization had no effect because we missed the -march=native option.
encode.o: CFLAGS += ${CFLAGS_VECTORIZE} -march=native
Auto-vectorization gives roughly a 30% improvement over the default build.
   buf | default | auto_vec |  SVE
-------+---------+----------+------
    16 |      16 |       12 |    8
    64 |      58 |       40 |    9
   256 |     223 |      152 |   18
  1024 |     934 |      613 |   54
  4096 |    3533 |     2430 |  202
 16384 |   14081 |     9831 |  800
 65536 |   56374 |    38702 | 3202
Auto-vectorization had no effect on hex_decode due to the presence of control flow.
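To illustrate the difference in loop shape, here is a simplified sketch. It is not the PostgreSQL source. The names sketch_hex_encode, sketch_hex_decode, nibble_to_hex and hex_to_nibble are hypothetical. The encode loop is straight-line code with a fixed trip count, which is the shape a compiler can usually vectorize. The decode loop skips whitespace and exits early on invalid input, which is the kind of control flow that tends to block auto-vectorization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: branch-free conversion of a 4-bit value to a hex digit. */
static inline char nibble_to_hex(uint8_t n)
{
    /* 0..9 -> '0'..'9', 10..15 -> 'a'..'f' */
    return (char) (n + '0' + (n > 9) * ('a' - '0' - 10));
}

/* Encode: no data-dependent branches, every iteration does the same work. */
size_t sketch_hex_encode(const uint8_t *src, size_t len, char *dst)
{
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < len; i++)
    {
        dst[2 * i] = nibble_to_hex(src[i] >> 4);
        dst[2 * i + 1] = nibble_to_hex(src[i] & 0xF);
    }
    return len * 2;
}

/* Hypothetical helper: returns 0..15 for a hex digit, -1 otherwise. */
static inline int hex_to_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/*
 * Decode: whitespace skipping and early error returns make the number of
 * input bytes consumed per iteration data-dependent.
 * Returns the number of bytes written, or -1 on invalid input.
 */
long sketch_hex_decode(const char *src, size_t len, uint8_t *dst)
{
    size_t i = 0;
    size_t out = 0;

    while (i < len)
    {
        if (src[i] == ' ' || src[i] == '\n' || src[i] == '\t' || src[i] == '\r')
        {
            i++;                /* skip whitespace between digit pairs */
            continue;
        }
        if (i + 1 >= len)
            return -1;          /* odd number of digits */

        int hi = hex_to_nibble(src[i]);
        int lo = hex_to_nibble(src[i + 1]);

        if (hi < 0 || lo < 0)
            return -1;          /* invalid hex digit */

        dst[out++] = (uint8_t) ((hi << 4) | lo);
        i += 2;
    }
    return (long) out;
}
```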
-----
Here is a comment snippet from src/include/port/simd.h
"While Neon support is technically optional for aarch64, it appears that all available 64-bit hardware does have it."
Currently, it is assumed that all aarch64 machines support NEON, but for newer SIMD extensions such as SVE (and AVX-512 on x86) this assumption may not hold. We need a runtime check to be sure. Using src/include/port/simd.h to abstract away these advanced SIMD implementations may be difficult.
We will update the thread once a solution is found.
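For illustration only, a runtime check of the kind described above could look like the sketch below. It is not part of the patch. It assumes Linux on aarch64, where the kernel reports SVE through getauxval(AT_HWCAP) and the HWCAP_SVE bit; on any other platform the check simply returns false. All sketch_* names are hypothetical, and the "SVE" function is a stand-in that calls the scalar code, since a real SVE kernel would need to be compiled with SVE enabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>           /* getauxval, AT_HWCAP */
#include <asm/hwcap.h>          /* HWCAP_SVE */
#endif

/* Returns true only when the kernel reports SVE support (Linux/aarch64). */
static bool sketch_sve_available(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SVE)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return false;               /* assumption: no SVE path elsewhere */
#endif
}

typedef size_t (*sketch_hex_encode_fn) (const uint8_t *src, size_t len, char *dst);

/* Portable fallback. */
static size_t sketch_hex_encode_scalar(const uint8_t *src, size_t len, char *dst)
{
    static const char hextbl[] = "0123456789abcdef";

    for (size_t i = 0; i < len; i++)
    {
        dst[2 * i] = hextbl[src[i] >> 4];
        dst[2 * i + 1] = hextbl[src[i] & 0xF];
    }
    return len * 2;
}

/* Stand-in for an SVE kernel; here it just reuses the scalar code. */
static size_t sketch_hex_encode_sve(const uint8_t *src, size_t len, char *dst)
{
    return sketch_hex_encode_scalar(src, len, dst);
}

static sketch_hex_encode_fn sketch_hex_encode_impl = NULL;

/*
 * Picks an implementation on first use and caches the choice.
 * Not thread-safe as written; a real version would need to address that.
 */
size_t sketch_hex_encode_dispatch(const uint8_t *src, size_t len, char *dst)
{
    if (sketch_hex_encode_impl == NULL)
        sketch_hex_encode_impl = sketch_sve_available()
            ? sketch_hex_encode_sve
            : sketch_hex_encode_scalar;

    assert(sketch_hex_encode_impl != NULL);
    return sketch_hex_encode_impl(src, len, dst);
}
```

The point of the function pointer is that the SVE code can live in a separately compiled function while callers remain unaware of which variant runs.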
-----
Chiranmoy