Re: [PATCH] Hex-coding optimizations using SVE on ARM.

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: John Naylor <johncnaylorls(at)gmail(dot)com>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, "Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com" <Chiranmoy(dot)Bhattacharya(at)fujitsu(dot)com>, "Devanga(dot)Susmitha(at)fujitsu(dot)com" <Devanga(dot)Susmitha(at)fujitsu(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, "Ragesh(dot)Hajela(at)fujitsu(dot)com" <Ragesh(dot)Hajela(at)fujitsu(dot)com>
Subject: Re: [PATCH] Hex-coding optimizations using SVE on ARM.
Date: 2025-01-15 07:14:48
Message-ID: 1194263.1736925288@sss.pgh.pa.us
Lists: pgsql-hackers

John Naylor <johncnaylorls(at)gmail(dot)com> writes:
> Okay, I added a comment. I also agree with Michael that my quick
> one-off was a bit hard to read so I've cleaned it up a bit. I plan to
> commit the attached by Friday, along with any bikeshedding that
> happens by then.

Couple of thoughts:

1. I was actually hoping for a comment on the constant's definition,
perhaps along the lines of

/*
 * The hex expansion of each possible byte value (two chars per value).
 */

2. Since "src" is defined as "const char *", I'm pretty sure that
pickier compilers will complain that

+ unsigned char usrc = *((unsigned char *) src);

results in casting away const. Recommend

+ unsigned char usrc = *((const unsigned char *) src);
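
To illustrate point 2, here is a minimal sketch, not taken from the patch; first_byte is a hypothetical helper name used only for this example. It reads a byte through a const-qualified cast, which is the form that should stay quiet under warnings such as GCC's and Clang's -Wcast-qual, whereas the cast to plain "unsigned char *" is the one that drops the qualifier.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): fetch the first byte of
 * "src" as an unsigned value without casting away const.
 */
static unsigned char
first_byte(const char *src)
{
	/* The cast keeps the const qualifier, so nothing is discarded. */
	return *((const unsigned char *) src);
}
```

Reading through unsigned char also guarantees an index in the range 0..255, even on platforms where plain char is signed.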

3. I really wonder if

+ memcpy(dst, &hextbl[2 * usrc], 2);

is faster than copying the two bytes manually, along the lines of

+ *dst++ = hextbl[2 * usrc];
+ *dst++ = hextbl[2 * usrc + 1];

Compilers that inline memcpy() may arrive at the same machine code,
but why rely on the compiler to make that optimization? If the
compiler fails to do so, an out-of-line memcpy() call will surely
be a loser.

A variant could be

+ const char *hexptr = &hextbl[2 * usrc];
+ *dst++ = hexptr[0];
+ *dst++ = hexptr[1];

but that only helps if the compiler fails to spot the common
subexpression in the other formulation, and I believe most
modern compilers will spot it.
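
For anyone who wants to compare the formulations side by side, the following is a rough, self-contained sketch rather than the patch itself. The table contents and the function names hex_encode_memcpy and hex_encode_manual are assumptions made up for this illustration; only the loop bodies mirror the lines quoted above. Both functions should produce identical output, so any difference would show up only in the generated machine code or in timings.

```c
#include <stddef.h>
#include <string.h>

/*
 * The hex expansion of each possible byte value (two chars per value).
 * Assumed layout for this sketch: lowercase digits, entry i at offset 2*i.
 */
static const char hextbl[] =
	"000102030405060708090a0b0c0d0e0f"
	"101112131415161718191a1b1c1d1e1f"
	"202122232425262728292a2b2c2d2e2f"
	"303132333435363738393a3b3c3d3e3f"
	"404142434445464748494a4b4c4d4e4f"
	"505152535455565758595a5b5c5d5e5f"
	"606162636465666768696a6b6c6d6e6f"
	"707172737475767778797a7b7c7d7e7f"
	"808182838485868788898a8b8c8d8e8f"
	"909192939495969798999a9b9c9d9e9f"
	"a0a1a2a3a4a5a6a7a8a9aaabacadaeaf"
	"b0b1b2b3b4b5b6b7b8b9babbbcbdbebf"
	"c0c1c2c3c4c5c6c7c8c9cacbcccdcecf"
	"d0d1d2d3d4d5d6d7d8d9dadbdcdddedf"
	"e0e1e2e3e4e5e6e7e8e9eaebecedeeef"
	"f0f1f2f3f4f5f6f7f8f9fafbfcfdfeff";

/* Variant using memcpy(); relies on the compiler inlining the 2-byte copy. */
static size_t
hex_encode_memcpy(const char *src, size_t len, char *dst)
{
	const char *end = src + len;

	while (src < end)
	{
		unsigned char usrc = *((const unsigned char *) src);

		memcpy(dst, &hextbl[2 * usrc], 2);
		src++;
		dst += 2;
	}
	return len * 2;
}

/* Variant copying the two bytes by hand; no library call involved. */
static size_t
hex_encode_manual(const char *src, size_t len, char *dst)
{
	const char *end = src + len;

	while (src < end)
	{
		unsigned char usrc = *((const unsigned char *) src);

		*dst++ = hextbl[2 * usrc];
		*dst++ = hextbl[2 * usrc + 1];
		src++;
	}
	return len * 2;
}
```

The caller is assumed to supply a dst buffer of at least 2 * len bytes; neither function writes a terminating NUL. Which variant is faster would have to be measured per compiler and platform.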

regards, tom lane
