Re: define pg_structiszero(addr, s, r)

From: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Ranier Vilela <ranier(dot)vf(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: define pg_structiszero(addr, s, r)
Date: 2024-11-12 10:56:20
Message-ID: ZzM0VNK161K95ZHn@ip-10-97-1-34.eu-west-3.compute.internal
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On Tue, Nov 12, 2024 at 03:56:13PM +0900, Michael Paquier wrote:
> On Tue, Nov 12, 2024 at 06:09:04AM +0000, Bertrand Drouvot wrote:
> > I think that the 64b len check done in v11 is mandatory for safety reasons.
> >
> > The loop above reads 64 bytes at once, so would read beyond the memory area bounds
> > if len < 64: That could cause crash or read invalid data.
>
> Sorry, I was not following your argument. You're right that we need
> something else here. However..
>
> + /*
> + * For len < 64, compare byte per byte to ensure we'll not read beyond the
> + * memory area.
> + */
> + if (len < sizeof(size_t) * 8)
> + {
> + while (p < end)
> + {
> + if (*p++ != 0)
> + return false;
> + }
> + return true;
> + }
> +
> + /* Compare bytes until the pointer "p" is aligned */
> + while (((uintptr_t) p & (sizeof(size_t) - 1)) != 0)
> + {
> + if (p == end)
> + return true;
> +
> + if (*p++ != 0)
> + return false;
> + }
> +
>
> Still, this is not optimal, based on what's been discussed upthread.
> The byte-per-byte check is more expensive than the size_t check,

I think that depends of the memory area size. If the size is small enough then the
byte per byte can be good enough.

For example, with the allzeros_small.c attached:

== with BLCKSZ 32

$ /usr/local/gcc-14.1.0/bin/gcc-14.1.0 -march=native -O2 allzeros_small.c -o allzeros_small ; ./allzeros_small
byte per byte: done in 22528 nanoseconds
size_t: done in 6949 nanoseconds (3.24191 times faster than byte per byte)
SIMD v10: done in 7562 nanoseconds (2.97911 times faster than byte per byte)
SIMD v11: done in 22096 nanoseconds (1.01955 times faster than byte per byte)

== with BLCKSZ 63

$ /usr/local/gcc-14.1.0/bin/gcc-14.1.0 -march=native -O2 allzeros_small.c -o allzeros_small ; ./allzeros_small
byte per byte: done in 29246 nanoseconds
size_t: done in 10555 nanoseconds (2.77082 times faster than byte per byte)
SIMD v10: done in 11220 nanoseconds (2.6066 times faster than byte per byte)
SIMD v11: done in 29126 nanoseconds (1.00412 times faster than byte per byte)

Obviously v11 is about the same time as "byte per byte" but we can see that the
size_t or v10 improvment is not that much for small size.

While for larger size:

== with BLCKSZ 256

$ /usr/local/gcc-14.1.0/bin/gcc-14.1.0 -march=native -O2 allzeros_small.c -o allzeros_small ; ./allzeros_small
byte per byte: done in 102703 nanoseconds
size_t: done in 15381 nanoseconds (6.67726 times faster than byte per byte)
SIMD v10: done in 7241 nanoseconds (14.1835 times faster than byte per byte)
SIMD v11: done in 7899 nanoseconds (13.002 times faster than byte per byte)

== with BLCKSZ 8192

$ /usr/local/gcc-14.1.0/bin/gcc-14.1.0 -march=native -O2 allzeros_small.c -o allzeros_small ; ./allzeros_small
byte per byte: done in 2993458 nanoseconds
size_t: done in 436650 nanoseconds (6.85551 times faster than byte per byte)
SIMD v10: done in 136413 nanoseconds (21.9441 times faster than byte per byte)
SIMD v11: done in 155474 nanoseconds (19.2538 times faster than byte per byte)

It's sensitive improvment.

> shouldn't you make sure that you stack some size_t checks if dealing
> with something smaller than 64 bytes?

Based on the above I've the feeling that doing byte per byte comparison for
small size only (< 64b) is good enough. I'm not sure that adding extra complexity
for small sizes is worth it.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
allzeros_small.c text/x-csrc 7.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message vignesh C 2024-11-12 11:21:08 Re: Parallel heap vacuum
Previous Message Torsten Förtsch 2024-11-12 10:54:21 Re: Allowing pg_recvlogical to create temporary replication slots