From: | Noah Misch <noah(at)leadboat(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, "Shankaran, Akash" <akash(dot)shankaran(at)intel(dot)com>, Nathan Bossart <nathandbossart(at)gmail(dot)com>, "Amonson, Paul D" <paul(dot)d(dot)amonson(at)intel(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Popcount optimization using AVX512 |
Date: | 2024-02-10 23:52:38 |
Message-ID: | 20240210235238.eb@rfd.leadboat.com |
Lists: | pgsql-hackers |
On Fri, Feb 09, 2024 at 08:33:23PM -0800, Andres Freund wrote:
> On 2024-02-09 15:27:57 -0800, Noah Misch wrote:
> > On Fri, Feb 09, 2024 at 10:24:32AM -0800, Andres Freund wrote:
> > > On 2024-01-26 07:42:33 +0100, Alvaro Herrera wrote:
> > > > This suggests that finding a way to make the ifunc stuff work (with good
> > > > performance) is critical to this work.
> > >
> > > Ifuncs are effectively implemented as a function call via a pointer, they're
> > > not magic, unfortunately. The sole trick they provide is that you don't
> > > manually have to use the function pointer.
> >
> > The IFUNC creators introduced it so glibc could use arch-specific memcpy with
> > the instruction sequence of a non-pointer, extern function call, not the
> > instruction sequence of a function pointer call.
>
> My understanding is that the ifunc mechanism just avoids the need for repeated
> indirect calls/jumps to implement a single function call, not the use of
> indirect function calls at all. Calls into shared libraries, like libc, are
> indirected via the GOT / PLT, i.e. an indirect function call/jump. Without
> ifuncs, the target of the function call would then have to dispatch to the
> resolved function. Ifuncs make it possible to avoid this repeated dispatch by moving the
> dispatch to the dynamic linker stage, modifying the contents of the GOT/PLT to
> point to the right function. Thus ifuncs are an optimization when calling a
> function in a shared library that's then dispatched depending on the cpu
> capabilities.
>
> However, in our case, where the code is in the same binary, function calls
> implemented in the main binary directly (possibly via a static library) don't
> go through GOT/PLT. In such a case, use of ifuncs turns a normal direct
> function call into one going through the GOT/PLT, i.e. makes it indirect. The
> same is true for calls within a shared library if either explicit symbol
> visibility is used, or -symbolic, -Wl,-Bsymbolic or such is used. Therefore
> there's no efficiency gain of ifuncs over a call via function pointer.
>
>
> This isn't because ifunc is implemented badly or something - the reason for
> this is that dynamic relocations aren't typically implemented by patching all
> callsites (".text relocations"), which is what you would need to avoid the
> need for an indirect call to something that fundamentally cannot be a constant
> address at link time. The reason text relocations are disfavored is that
> they can make program startup quite slow, that they require allowing
> modifications to executable pages which are disliked due to the security
> implications, and that they make the code non-shareable, as the in-memory
> executable code has to differ from the on-disk code.
>
>
> I actually think ifuncs within the same binary are a tad *slower* than plain
> function pointer calls, unless -fno-plt is used. Without -fno-plt, an ifunc is
> called by 1) a direct call into the PLT, 2) loading the target address from
> the GOT, 3) making an indirect jump to that address. Whereas a "plain
> indirect function call" is just 1) load target address from variable 2) making
> an indirect jump to that address. With -fno-plt the callsites themselves load
> the address from the GOT.
That sounds more accurate than what I wrote. Thanks.
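
For readers following the thread, here is a minimal C sketch of the two dispatch styles being compared. It is an illustration under stated assumptions, not code from the patch: every name in it (popcount32_portable, popcount32_fast, cpu_has_fast_popcount, popcount32_ptr, resolve_popcount32, popcount32_ifunc, and the USE_IFUNC macro) is hypothetical. The function-pointer variant starts out pointing at a "choose" routine that replaces the pointer on first use. The ifunc variant is compiled only when USE_IFUNC is defined, since it assumes GCC or Clang targeting ELF with a libc that processes ifunc relocations (glibc, for example).

```c
#include <assert.h>
#include <stdint.h>

/* Portable loop; stands in for the generic implementation. */
static int popcount32_portable(uint32_t word)
{
    int count = 0;

    while (word)
    {
        word &= word - 1;       /* clear the lowest set bit */
        count++;
    }
    return count;
}

/* Stands in for a CPU-specific implementation (hypothetical). */
static int popcount32_fast(uint32_t word)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_popcount(word);
#else
    return popcount32_portable(word);
#endif
}

/* Hypothetical capability probe; reports 0 where no check is available. */
static int cpu_has_fast_popcount(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    __builtin_cpu_init();       /* needed when run before constructors */
    return __builtin_cpu_supports("popcnt") != 0;
#else
    return 0;
#endif
}

/*
 * Variant 1: plain function pointer.  A call site loads the target address
 * from this variable and makes an indirect call.
 */
static int popcount32_choose(uint32_t word);
int (*popcount32_ptr)(uint32_t) = popcount32_choose;

static int popcount32_choose(uint32_t word)
{
    /* The first call picks an implementation and overwrites the pointer. */
    popcount32_ptr = cpu_has_fast_popcount() ? popcount32_fast
                                             : popcount32_portable;
    assert(popcount32_ptr != popcount32_choose);
    return popcount32_ptr(word);
}

#ifdef USE_IFUNC
/*
 * Variant 2: ifunc (opt-in).  The dynamic linker runs the resolver once and
 * stores the result in the GOT; call sites still reach the function
 * indirectly, through the PLT or via a GOT load with -fno-plt.
 */
static int (*resolve_popcount32(void))(uint32_t)
{
    return cpu_has_fast_popcount() ? popcount32_fast : popcount32_portable;
}

int popcount32_ifunc(uint32_t word)
    __attribute__((ifunc("resolve_popcount32")));
#endif
```

A caller would write popcount32_ptr(x) in the first case and popcount32_ifunc(x) in the second. Comparing the generated code for both call sites (for example with objdump -d, with and without -fno-plt) is one way to check the instruction sequences described in the quoted message.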