From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | Michael Paquier <michael(at)paquier(dot)xyz>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: define pg_structiszero(addr, s, r) |
Date: | 2024-11-01 06:33:51 |
Message-ID: | ZyR2Tzdmj9c6E/QB@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Hi,
On Fri, Nov 01, 2024 at 05:45:44PM +1300, David Rowley wrote:
> On Thu, 31 Oct 2024 at 19:17, Michael Paquier <michael(at)paquier(dot)xyz> wrote:
> > Seems kind of OK seen from here. I am wondering if others more
> > comments about the name of the macro, its location, the fact that we
> > still have pagebytes in bufpage.c, etc.
>
> This change looks incorrect:
>
> @@ -126,18 +124,9 @@ PageIsVerifiedExtended(Page page, BlockNumber
> blkno, int flags)
> }
>
> /* Check all-zeroes case */
> - all_zeroes = true;
> pagebytes = (size_t *) page;
> - for (i = 0; i < (BLCKSZ / sizeof(size_t)); i++)
> - {
> - if (pagebytes[i] != 0)
> - {
> - all_zeroes = false;
> - break;
> - }
> - }
>
> - if (all_zeroes)
> + if (pg_memory_is_all_zeros(pagebytes, (BLCKSZ / sizeof(size_t))))
> return true;
>
> I think this should be passing BLCKSZ rather than (BLCKSZ /
> sizeof(size_t)), otherwise, it'll just check the first 1 kilobyte is
> zero rather than the entire page.
Thanks for looking at it!
Nice catch, indeed by using the new function we are changing the pointer arithmetic
here and then we should pass BLCKSZ instead.
> However, I've also performance concerns as if I look at the assembly
> of PageIsVerifiedExtended, I see the zero checking is now done 1 byte
> at a time:
>
> (gcc 11.4)
>
> leaq 1024(%rbx), %rdx <-- 1KB bug
> .p2align 4,,10
> .p2align 3
> .L60:
> cmpb $0, (%rax) <-- check 1 byte is zero.
> jne .L59
> addq $1, %rax <-- increment by 1 byte.
> cmpq %rax, %rdx <-- check if we've done 1024 bytes yet.
> jne .L60
>
> Whereas previously it was doing:
>
> movq %rbx, %rax
> leaq 8192(%rbx), %rdx <-- 8KB
> jmp .L60
> .p2align 4,,10
> .p2align 3
> .L90:
> addq $8, %rax <-- increment by 8 bytes (or sizeof(size_t))
> cmpq %rax, %rdx
> je .L61
> .L60:
> cmpq $0, (%rax) <-- checks an entire 8 bytes are zero.
>
> I didn't test how performance-critical this is, but the header comment
> for this function does use the words "cheaply detect".
>
> * This is called when a page has just been read in from disk. The idea is
> * to cheaply detect trashed pages before we go nuts following bogus line
> * pointers, testing invalid transaction identifiers, etc.
>
> so it seems a bit dangerous to switch this loop to byte-at-a-time
> rather than doing 8 bytes at once without testing the performance
> isn't affected.
Agree, I did a quick test (see [0]) and it looks like it's significantly slower
using the new inline function.
We could try to write a more elaborate version of pg_memory_is_all_zeros(), but
as it looks like there is only one use case, then it's probably better to not
implement (revert) this change here and "just" add a comment as to why pg_memory_is_all_zeros()
should not be used here, thoughts?
[0]: https://godbolt.org/z/xqnW4MPY5
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Langote | 2024-11-01 06:55:19 | Re: Wrong result when enable_partitionwise_join is on if collation of PartitionKey and Column is different. |
Previous Message | Masahiko Sawada | 2024-11-01 06:33:08 | Re: UUID v7 |