Re: define pg_structiszero(addr, s, r)

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: define pg_structiszero(addr, s, r)
Date: 2024-11-01 06:27:38
Message-ID: ZyR02ofHiWG1HmLI@paquier.xyz
Lists: pgsql-hackers

On Fri, Nov 01, 2024 at 05:45:44PM +1300, David Rowley wrote:
> I think this should be passing BLCKSZ rather than (BLCKSZ /
> sizeof(size_t)), otherwise, it'll just check the first 1 kilobyte is
> zero rather than the entire page.

Ugh, Friday brain fart. The attached should fix that; it brings the
movl back to the correct value:
- movl $1024, %esi
+ movl $8192, %esi

Does that look fine to you?
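
To illustrate the problem David pointed out, here is a minimal
standalone sketch, not the attached patch. The names all_zeros(),
page_is_zero_buggy() and page_is_zero_fixed() are made up for
illustration, and BLCKSZ is assumed to be the default 8192. It also
assumes the helper takes a length in bytes, which is what the review
comment implies:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumption: default PostgreSQL block size. */
#define BLCKSZ 8192

/*
 * Hypothetical stand-in for the helper under discussion: returns true
 * if the first "len" BYTES at "ptr" are all zero.
 */
static bool
all_zeros(const void *ptr, size_t len)
{
    const unsigned char *p = (const unsigned char *) ptr;

    for (size_t i = 0; i < len; i++)
    {
        if (p[i] != 0)
            return false;
    }
    return true;
}

/* Buggy caller: passes a word count where a byte count is expected. */
static bool
page_is_zero_buggy(const char *page)
{
    return all_zeros(page, BLCKSZ / sizeof(size_t));
}

/* Fixed caller: passes the full page size in bytes. */
static bool
page_is_zero_fixed(const char *page)
{
    return all_zeros(page, BLCKSZ);
}
```

With a page that is zero everywhere except its last byte,
page_is_zero_buggy() reports true while page_is_zero_fixed() reports
false.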

> I didn't test how performance-critical this is, but the header comment
> for this function does use the words "cheaply detect".

Under gcc -O2 or -O3, the single-byte check and the 8-byte check
make no measurable difference. Please see the attached allzeros.c for
a quick test if you want to check by yourself. Both average around
3ms for 1M iterations on my laptop (not the fastest thing around).

Under -O0, though, the difference is noticeable:
- 1-byte check: 3.52s for 1M iterations, averaging one check at
3.52us.
- 8-byte check: 0.46s for 1M iterations, averaging one check at
0.46us.

Even so, I doubt that this is going to be noticeable in practice,
but the difference exists.
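
For readers without the attachment, the two variants being compared
can be sketched roughly as follows. This is not the attached
allzeros.c; the function names are hypothetical, and the word-wise
variant uses memcpy() so that it stays valid for unaligned input,
which the measured code may or may not do:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical 1-byte check: compare one byte per loop iteration. */
static bool
allzeros_byte(const void *ptr, size_t len)
{
    const unsigned char *p = ptr;
    const unsigned char *end = p + len;

    while (p < end)
    {
        if (*p++ != 0)
            return false;
    }
    return true;
}

/*
 * Hypothetical word-wise check: compare sizeof(size_t) bytes per loop
 * iteration (8 bytes on typical 64-bit platforms), then finish any
 * remaining tail one byte at a time.
 */
static bool
allzeros_word(const void *ptr, size_t len)
{
    const unsigned char *p = ptr;
    const unsigned char *end = p + len;

    while ((size_t) (end - p) >= sizeof(size_t))
    {
        size_t      w;

        /* memcpy avoids alignment and aliasing problems */
        memcpy(&w, p, sizeof(w));
        if (w != 0)
            return false;
        p += sizeof(w);
    }

    while (p < end)
    {
        if (*p++ != 0)
            return false;
    }
    return true;
}
```

Both functions return the same answer for any buffer; only the
comparison width differs, so timings depend on what the compiler does
with each loop at a given optimization level.
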
--
Michael

Attachment            Content-Type   Size
allzeros.c            text/x-csrc    699 bytes
page-allzeros.patch   text/x-diff    729 bytes
