From: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
---|---|
To: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, David Rowley <dgrowleyml(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: define pg_structiszero(addr, s, r) |
Date: | 2024-11-15 12:54:33 |
Message-ID: | CAEudQAr3TTTs7et_ke8V0JhggmPnYOJdwZPYuZePF7MPjcn-yQ@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Em sex., 15 de nov. de 2024 às 03:46, Bertrand Drouvot <
bertranddrouvot(dot)pg(at)gmail(dot)com> escreveu:
> Hi,
>
> On Fri, Nov 15, 2024 at 09:30:25AM +0900, Michael Paquier wrote:
> > On Thu, Nov 14, 2024 at 12:33:20PM +0000, Bertrand Drouvot wrote:
> > Anyway, as you say, the
> > portability of v12 is OK even for sizeof(size_t) == 4 because we don't
> > rely on any hardcoded values, and this patch does what it should in
> > this case (double-checked myself manually for the three cases with
> > -m32).
>
> Yeah, thanks for the testing!
>
> > > What would be unsafe on 32-bit would be to read up to 32 bytes while
> len < 32
> > > and that can not happen.
> > >
> > > As mentioned up-thread the comments are wrong on 32-bit, indeed they
> must be read
> > > as:
> > >
> > > Case 1: len < 4 bytes
> > > Case 2: len in the 4-31 bytes range
> > > Case 3: len >= 32 bytes
> >
> > This part could be indeed better than what's proposed in v12, so I
> > would recommend to use sizeof(size_t) a bit more consistently rather
> > than have the reader guess that.
>
> Makes sense even if that looks "more difficult" to read.
>
> > Some comments feel duplicated, as well, like the "no risk" mentions,
> > which are clear enough based on the description and the limitations of
> > the previous cases. I'd like to suggest a few tweaks, making the
> > comments more flexible. See 0003 that applies on top of your latest
> > patch set, reattaching v12 again.
>
> Thanks! Applied on v13 attached, except for things like:
>
> "
> - /* Compare bytes until the pointer "p" is aligned */
> + /* Compare bytes until the pointer "p" is aligned. */
> "
>
> which is adding a "." at the end of single line comments (as the few
> already
> part of this file don't do so).
>
There is a tiny typo with V13.
+ /* "len" in the [sizeof(size_t) * 8, inf] range */
But, I'm not sure if I'm still doing something wrong.
If so, forgive me for the noise.
In the v3_allzeros_check.c attached,
the results is:
cc -march=native -O2 v3_allzeros_check.c -o v3_allzeros_check ;
./v3_allzeros_check
pagebytes[BLCKSZ-2]=1
byte per byte: is_allzeros
size_t: is_allzeros
SIMD v10: is_allzeros
SIMD v11: is_allzeros
SIMD v12: is_allzeros
SIMD v14: is_allzeros
Of course I expected "not is_allzeros".
Anyway, I made another attempt to optimize a bit more, I don't know if it's
safe though.
results with v3_allzeros_small.c attached:
WITH 8192 BLCKSZ
Ubuntu 22.04 64 bits
gcc -march=native -O2 v3_allzeros_small.c -o v3_allzeros_small ;
./v3_allzeros_small
byte per byte: done in 5027744 nanoseconds
size_t: done in 382521 nanoseconds (13.1437 times faster than byte per byte)
SIMD v10: done in 157777 nanoseconds (31.8661 times faster than byte per
byte)
SIMD v11: done in 159696 nanoseconds (31.4832 times faster than byte per
byte)
SIMD v12: done in 168117 nanoseconds (29.9062 times faster than byte per
byte)
SIMD v14: done in 21008 nanoseconds (239.325 times faster than byte per
byte)
best regards,
Ranier Vilela
Attachment | Content-Type | Size |
---|---|---|
v3_allzeros_small.c | text/x-csrc | 14.3 KB |
v3_allzeros_check.c | text/x-csrc | 13.0 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Mark Hill | 2024-11-15 13:01:36 | RE: FW: Building Postgres 17.0 with meson |
Previous Message | Heikki Linnakangas | 2024-11-15 12:39:15 | Re: Interrupts vs signals |