From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Ranier Vilela <ranier(dot)vf(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, David Rowley <dgrowleyml(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: define pg_structiszero(addr, s, r) |
Date: | 2024-11-17 05:36:27 |
Message-ID: | ZzmA24RVVBUhRVg8@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Sat, Nov 16, 2024 at 11:42:54AM -0300, Ranier Vilela wrote:
> > Em sex., 15 de nov. de 2024 às 11:43, Bertrand Drouvot <
> > bertranddrouvot(dot)pg(at)gmail(dot)com> escreveu:
> >
> >> while that should be:
> >>
> >> "
> >> static inline bool
> >> pg_memory_is_all_zeros_simd(const void *p, const void *end)
> >>
> > What I'm trying here, obviously, is a hack.
> > If it works, and the compiler accepts it, it's ok for me.
> >
> > If this hack is safe and correct, I think that 204 times faster,
> > it is very good, for a block size 8192.
The "hack" is not correct, because it's doing:
"
static inline bool
all_zeros_simd(const size_t *p, const size_t * end)
{
for (; p < (end - sizeof(size_t) * 7); p += sizeof(size_t) * 8)
{
if ((((size_t *) p)[0] != 0) | (((size_t *) p)[1] != 0) |
(((size_t *) p)[2] != 0) | (((size_t *) p)[3] != 0) |
(((size_t *) p)[4] != 0) | (((size_t *) p)[5] != 0) |
(((size_t *) p)[6] != 0) | (((size_t *) p)[7] != 0))
return false;
}
.
.
"
"p += sizeof(size_t) * 8" advances by 64 elements. But those elements are "size_t"
elements (since you're using size_t pointers as the function arguments).
Then instead of advancing by 64 bytes, you know advance by 512 bytes. But
you only check 64 bytes per iteration -> you're missing 448 bytes to check
per iteration.
We can "visualize" this by adding a few output messages like:
"
static inline bool
all_zeros_simd(const size_t *p, const size_t * end)
{
for (; p < (end - sizeof(size_t) * 7); p += sizeof(size_t) * 8)
{
printf("Current p: %p\n", (void*)p);
printf("Checking elements:\n");
printf("[0]: %p = %zu\n", (void*)&((size_t *)p)[0], ((size_t *)p)[0]);
printf("[1]: %p = %zu\n", (void*)&((size_t *)p)[1], ((size_t *)p)[1]);
printf("[2]: %p = %zu\n", (void*)&((size_t *)p)[2], ((size_t *)p)[2]);
printf("[3]: %p = %zu\n", (void*)&((size_t *)p)[3], ((size_t *)p)[3]);
printf("[4]: %p = %zu\n", (void*)&((size_t *)p)[4], ((size_t *)p)[4]);
printf("[5]: %p = %zu\n", (void*)&((size_t *)p)[5], ((size_t *)p)[5]);
printf("[6]: %p = %zu\n", (void*)&((size_t *)p)[6], ((size_t *)p)[6]);
printf("[7]: %p = %zu\n", (void*)&((size_t *)p)[7], ((size_t *)p)[7]);
const size_t *next_p = p + sizeof(size_t) * 8;
printf("Next p will be: %p (advance of %zu bytes)\n",
(void*)next_p,
(size_t)((char*)next_p - (char*)p));
.
.
.
"
Then we get things like:
"
Current p: 0x7fff2a93e500
Checking elements:
[0]: 0x7fff2a93e500 = 0
[1]: 0x7fff2a93e508 = 0
[2]: 0x7fff2a93e510 = 0
[3]: 0x7fff2a93e518 = 0
[4]: 0x7fff2a93e520 = 0
[5]: 0x7fff2a93e528 = 0
[6]: 0x7fff2a93e530 = 0
[7]: 0x7fff2a93e538 = 0
Next p will be: 0x7fff2a93e700 (advance of 512 bytes)
"
Meaning that you're checking 64 bytes per iteration (for example from 0x500 to
0x508 is 8 bytes) while advancing by 512 bytes: you're missing 448 bytes to check
per iteration.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Tom Turelinckx | 2024-11-17 17:28:13 | Re: RISC-V animals sporadically produce weird memory-related failures |
Previous Message | Pavel Stehule | 2024-11-17 04:53:17 | Re: proposal: schema variables |