Re: [PATCH] Hex-coding optimizations using SVE on ARM.

From: "Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com" <Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com>
To: Nathan Bossart <nathandbossart(at)gmail(dot)com>
Cc: "Devanga(dot)Susmitha(at)fujitsu(dot)com" <Devanga(dot)Susmitha(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Ragesh(dot)Hajela(at)fujitsu(dot)com" <Ragesh(dot)Hajela(at)fujitsu(dot)com>
Subject: Re: [PATCH] Hex-coding optimizations using SVE on ARM.
Date: 2025-01-13 15:48:49
Message-ID: TY2PR01MB2667B294A9D2556C05BE02A0971F2@TY2PR01MB2667.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

On Fri, Jan 10, 2025 at 09:38:14AM -0600, Nathan Bossart wrote:
> Do you mean that the auto-vectorization worked and you observed no
> performance improvement, or the auto-vectorization had no effect on the
> code generated?

Auto-vectorization now works on Graviton 3 (m7g.4xlarge) with GCC 11.4 after the following addition, and the results match yours. Previously, auto-vectorization had no effect because we had omitted the -march=native option.

      encode.o: CFLAGS += ${CFLAGS_VECTORIZE} -march=native

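Auto-vectorization gives roughly a 30% improvement over the default build (for example, 56374 vs. 38702 for a 65536-byte buffer). The SVE version is still much faster.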

   buf  | default | auto_vec |  SVE
--------+---------+----------+------
     16 |      16 |       12 |    8
     64 |      58 |       40 |    9
    256 |     223 |      152 |   18
   1024 |     934 |      613 |   54
   4096 |    3533 |     2430 |  202
  16384 |   14081 |     9831 |  800
  65536 |   56374 |    38702 | 3202

Auto-vectorization had no effect on hex_decode due to the presence of control flow.
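
To illustrate the difference, here is a minimal sketch of the two loop shapes. These are simplified stand-ins and not the code in encode.c; the names sketch_hex_encode and sketch_hex_decode are hypothetical. The encode loop has a fixed trip count and no data-dependent exit, which is the shape a vectorizer can usually handle. The decode loop has to return as soon as it sees an invalid digit, and that early exit is the kind of control flow that tends to block auto-vectorization.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Encode: every iteration does the same work, with no early exit.
 * The ternaries are simple selects, not branches that leave the loop.
 */
size_t
sketch_hex_encode(const uint8_t *src, size_t len, char *dst)
{
    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < len; i++)
    {
        uint8_t hi = src[i] >> 4;
        uint8_t lo = src[i] & 0x0F;

        dst[2 * i] = (char) (hi + (hi < 10 ? '0' : 'a' - 10));
        dst[2 * i + 1] = (char) (lo + (lo < 10 ? '0' : 'a' - 10));
    }
    return len * 2;
}

/* Map one hex character to its value, or -1 if it is not a hex digit. */
static int
sketch_hex_digit(char c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/*
 * Decode: returns the number of bytes written, or -1 on invalid input.
 * The early return depends on the data, so the loop's trip count is
 * not known up front.
 */
long
sketch_hex_decode(const char *src, size_t len, uint8_t *dst)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);

    if (len % 2 != 0)
        return -1;

    for (size_t i = 0; i < len; i += 2)
    {
        int hi = sketch_hex_digit(src[i]);
        int lo = sketch_hex_digit(src[i + 1]);

        if (hi < 0 || lo < 0)
            return -1;          /* data-dependent exit */

        dst[n++] = (uint8_t) ((hi << 4) | lo);
    }
    return (long) n;
}
```

Whether a given compiler actually vectorizes the first loop depends on its version and flags; GCC's -fopt-info-vec output is one way to check.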

-----
Here is a comment snippet from src/include/port/simd.h

"While Neon support is technically optional for aarch64, it appears that all available 64-bit hardware does have it."

Currently, it is assumed that all aarch64 machines support NEON, but for newer advanced SIMD extensions like SVE (and AVX512 on x86) this assumption may not hold. We need a runtime check to be sure. Using src/include/port/simd.h to abstract away these advanced SIMD implementations may be difficult.
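
As a rough idea of what such a runtime check could look like on Linux, here is a sketch that reads the hardware capability bits with getauxval() and picks an implementation through a function pointer on first use. This is not a proposed patch. All sketch_* names are hypothetical, the "SVE" function is only a placeholder that forwards to the scalar code, and on anything other than Linux/aarch64 the check simply reports that SVE is unavailable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>           /* getauxval(), AT_HWCAP */
#ifndef HWCAP_SVE
#define HWCAP_SVE (1 << 22)     /* value from the Linux arm64 hwcap header */
#endif
#endif

/* Runtime check: does the CPU (and kernel) expose SVE to this process? */
bool
sketch_sve_available(void)
{
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return false;               /* assumption: no SVE elsewhere */
#endif
}

typedef size_t (*sketch_encode_fn) (const uint8_t *src, size_t len, char *dst);

/* Portable fallback. */
size_t
sketch_encode_scalar(const uint8_t *src, size_t len, char *dst)
{
    static const char digits[] = "0123456789abcdef";

    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < len; i++)
    {
        dst[2 * i] = digits[src[i] >> 4];
        dst[2 * i + 1] = digits[src[i] & 0x0F];
    }
    return len * 2;
}

/* Placeholder where a real SVE kernel would go; forwards to scalar here. */
size_t
sketch_encode_sve_placeholder(const uint8_t *src, size_t len, char *dst)
{
    return sketch_encode_scalar(src, len, dst);
}

static size_t sketch_encode_choose(const uint8_t *src, size_t len, char *dst);

/* Starts out pointing at the chooser, which replaces itself on first call. */
static sketch_encode_fn sketch_encode_impl = sketch_encode_choose;

static size_t
sketch_encode_choose(const uint8_t *src, size_t len, char *dst)
{
    sketch_encode_impl = sketch_sve_available()
        ? sketch_encode_sve_placeholder
        : sketch_encode_scalar;
    return sketch_encode_impl(src, len, dst);
}

/* Public entry point: callers never need to know which version runs. */
size_t
sketch_hex_encode_dispatch(const uint8_t *src, size_t len, char *dst)
{
    return sketch_encode_impl(src, len, dst);
}
```

The check runs once, and later calls go straight through the pointer. The SVE kernel itself would still need to be compiled with SVE enabled for that one function or file, which is part of why fitting it behind simd.h is not straightforward.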

We will update the thread once a solution is found.

-----
Chiranmoy
