Re: define pg_structiszero(addr, s, r)

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: David Rowley <dgrowleyml(at)gmail(dot)com>, Peter Smith <smithpb2250(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: define pg_structiszero(addr, s, r)
Date: 2024-11-05 04:31:58
Message-ID: ZymfviixIBCNvzTJ@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 04, 2024 at 05:17:54PM +0000, Bertrand Drouvot wrote:
> Hi,
>
> On Tue, Nov 05, 2024 at 12:24:48AM +1300, David Rowley wrote:
> > On Sat, 2 Nov 2024 at 01:50, Bertrand Drouvot
> > <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> > >
> > > On Fri, Nov 01, 2024 at 09:47:05PM +1300, David Rowley wrote:
> > > > I've attached what I thought a more optimal version might look like in
> > > > case anyone thinks making it better is a good idea.
> > > >
> > >
> > > Thanks for the proposal!
> > >
> > > I like the idea, I think that's worth to add a few comments, something like:
> >
> > I'm happy if you want to pick this up and continue working on it.
>
> Sure, please find attached v1, the changes are:
>
> - switch from "const char" to "const unsigned char" (could have been done in the
> current version of pg_memory_is_all_zeros() though)
> - added some comments

+ * The test is divided into three phases for efficiency:
+ * - Initial alignment (byte per byte comparison)
+ * - Multiple bytes comparison at once
+ * - Remaining bytes (byte per byte comparison)

It does not look like this insists enough on the alignment part of the
optization? A MAXALIGN'd size would use only size_t comparisons, and
a pointer aligned would do no byte comparisons.

> - adding an Assert for ptr != 0

I'm not sure that the Assert() addition is a good idea. That could
get hot very easily depending on the caller, even if for assert
builds we don't care much about the performance, that could lead to
some paths being a lot slower.

On Fri, 01 Nov 2024 at 21:47, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> The thought of having to write a comment to warn people not to use it
> for performance-critical things makes me think it might be better just
> to write a more optimal version of the function so we don't need to
> warn people. I looked around at the callers of the function I saw the
> following numbers of bytes being used for the length: 8192 (the one in
> question), 88, 32 and 112.
>
> I don't know how performance-critical the final three of those are,
> but I imagine all apart from the 32-byte one might be better with a
> non-inlined and more optimised version of the function. The problem
> with inlining the optimised version is that it's more code to inline.

These three don't matter in terms of performance. The cycles spent
for all-zero checks of the checkpointer and bgwriter are nothing
compared to the concurrent I/O activity they handle, and the sizes are
small compared to the 8k pages. The flush callback for relation
pgstats happens once at a given interval (see around
PGSTAT_[MIN|MAX]_INTERVAL).
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2024-11-05 05:13:05 Re: Pgoutput not capturing the generated columns
Previous Message Kyotaro Horiguchi 2024-11-05 04:25:26 Re: In-placre persistance change of a relation