From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | David Rowley <dgrowleyml(at)gmail(dot)com> |
Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: define pg_structiszero(addr, s, r) |
Date: | 2024-11-01 06:27:38 |
Message-ID: | ZyR02ofHiWG1HmLI@paquier.xyz |
Lists: | pgsql-hackers |
On Fri, Nov 01, 2024 at 05:45:44PM +1300, David Rowley wrote:
> I think this should be passing BLCKSZ rather than (BLCKSZ /
> sizeof(size_t)), otherwise, it'll just check the first 1 kilobyte is
> zero rather than the entire page.
Ugh, Friday brain fart. The attached should fix that; it brings the
movl back to its correct value:
- movl $1024, %esi
+ movl $8192, %esi
Does that look fine to you?
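To make the mistake concrete, here is a minimal sketch of it, not the actual patch. The names all_zeros(), page_is_zero_buggy() and page_is_zero_fixed() are hypothetical stand-ins, and BLCKSZ is assumed to be the default 8192. The helper takes a length in bytes, so dividing by sizeof(size_t) (8 on a 64-bit build) means only the first 1024 bytes get checked:

```c
#include <stdbool.h>
#include <stddef.h>

#define BLCKSZ 8192		/* assumption: default PostgreSQL page size */

/*
 * Hypothetical stand-in for the helper under discussion: returns true
 * if the first "len" BYTES starting at "ptr" are all zero.
 */
static bool
all_zeros(const void *ptr, size_t len)
{
	const unsigned char *p = (const unsigned char *) ptr;

	for (size_t i = 0; i < len; i++)
	{
		if (p[i] != 0)
			return false;
	}
	return true;
}

/* Buggy call: passes a count of size_t words where bytes are expected. */
static bool
page_is_zero_buggy(const char *page)
{
	return all_zeros(page, BLCKSZ / sizeof(size_t));
}

/* Fixed call: the whole page is checked. */
static bool
page_is_zero_fixed(const char *page)
{
	return all_zeros(page, BLCKSZ);
}
```

With a page that is zero everywhere except its last byte, the buggy variant reports the page as all zeros while the fixed variant does not.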
> I didn't test how performance-critical this is, but the header comment
> for this function does use the words "cheaply detect".
Under gcc -O2 or -O3, the single-byte check and the 8-byte check
perform the same. Please see the attached (allzeros.txt) for a quick
test if you want to check for yourself. With 1M iterations, both take
around 3ms in total on my laptop (not the fastest thing around).
Under -O0, though, the difference is noticeable:
- 1-byte check: 3.52s for 1M iterations, averaging one check at
3.52µs.
- 8-byte check: 0.46s for 1M iterations, averaging one check at
0.46µs.
Even so, I doubt that this is going to be noticeable in practice,
though the difference exists.
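For readers without the attachment, the two variants being compared could look roughly like the sketch below. This is not the attached file; the function names are hypothetical, and the word-sized variant uses memcpy() for its loads purely to sidestep alignment and aliasing concerns, which is an assumption of this sketch rather than something taken from the patch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* 1-byte check: one comparison per byte. */
static bool
all_zeros_bytewise(const void *ptr, size_t len)
{
	const unsigned char *p = (const unsigned char *) ptr;

	for (size_t i = 0; i < len; i++)
	{
		if (p[i] != 0)
			return false;
	}
	return true;
}

/*
 * Word-sized check: one comparison per size_t (8 bytes on a 64-bit
 * build), followed by a byte loop for any trailing bytes.
 */
static bool
all_zeros_wordwise(const void *ptr, size_t len)
{
	const unsigned char *p = (const unsigned char *) ptr;
	size_t		nwords = len / sizeof(size_t);

	for (size_t i = 0; i < nwords; i++)
	{
		size_t		w;

		/* memcpy keeps the load safe for unaligned input */
		memcpy(&w, p + i * sizeof(size_t), sizeof(w));
		if (w != 0)
			return false;
	}

	/* leftover bytes when len is not a multiple of sizeof(size_t) */
	for (size_t i = nwords * sizeof(size_t); i < len; i++)
	{
		if (p[i] != 0)
			return false;
	}
	return true;
}
```

Both functions return the same answer for any input; the only difference is how many loop iterations an unoptimized build has to execute per page.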
--
Michael
Attachment | Content-Type | Size |
---|---|---|
allzeros.c | text/x-csrc | 699 bytes |
page-allzeros.patch | text/x-diff | 729 bytes |