From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP: Faster Expression Processing v4 |
Date: | 2017-03-26 03:59:27 |
Message-ID: | 20170326035927.5mubkfdtaqlrgm2d@alap3.anarazel.de |
Lists: | pgsql-hackers |
On 2017-03-25 23:51:45 -0400, Tom Lane wrote:
> Andres Freund <andres(at)anarazel(dot)de> writes:
> > On March 25, 2017 4:56:11 PM PDT, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee> wrote:
> >> I haven't had the time to research this properly, but initial tests
> >> show that with GCC 6.2 adding
> >>
> >> #pragma GCC optimize ("no-crossjumping")
> >>
> >> fixes merging of the op tail jumps.
> >>
> >> Some quick and dirty benchmarking suggests that the benefit for the
> >> interpreter is about 15% (5% speedup on a workload that spends 1/3 in
> >> ExecInterpExpr). My idea of prefetching op->resnull/resvalue to local
> >> vars before the indirect jump is somewhere between a tiny benefit and
> >> no effect, certainly not worth introducing extra complexity. Clang 3.8
> >> does the correct thing out of the box and is a couple of percent
> >> faster than GCC with the pragma.
>
> > That's large enough to be worth doing (although I recall you seeing all jumps commonalized). We should probably do this on a per function basis however (either using pragma push option, or function attributes).
>
> Seems like it would be fine to do it on a per-file basis.
I personally find per-function annotation à la
__attribute__((optimize("no-crossjumping")))
cleaner anyway. I tested that, and it seems to work.
Obviously we'd have to hide that behind a configure test. Could also do
tests based on __GNUC__ / __GNUC_MINOR__, but that seems uglier.
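To make the per-function idea concrete, here is a minimal sketch, not the actual patch. The macro name pg_attribute_no_crossjumping, the configure symbol HAVE_FUNC_ATTRIBUTE_OPTIMIZE, and the toy opcodes are all assumptions made up for illustration. The interpreter uses GCC's "labels as values" extension so that each op ends in its own indirect jump, which is the tail that crossjumping would otherwise merge.

```c
#include <stddef.h>

/*
 * HAVE_FUNC_ATTRIBUTE_OPTIMIZE is a hypothetical symbol that a configure
 * test would define. The compiler check after it is the "uglier" fallback;
 * clang is excluded because it does not implement the optimize attribute.
 */
#if defined(HAVE_FUNC_ATTRIBUTE_OPTIMIZE) || \
	(defined(__GNUC__) && !defined(__clang__))
#define pg_attribute_no_crossjumping \
	__attribute__((optimize("no-crossjumping")))
#else
#define pg_attribute_no_crossjumping
#endif

/* Toy opcodes, purely illustrative (not PostgreSQL's ExprEvalOp). */
typedef enum
{
	OP_SET,
	OP_ADD,
	OP_DONE
} ToyOpCode;

typedef struct
{
	ToyOpCode	opcode;
	int			arg;
} ToyOp;

/* The attribute applies to this one function only, not the whole file. */
static int	toy_interp(const ToyOp *op) pg_attribute_no_crossjumping;

static int
toy_interp(const ToyOp *op)
{
	/* Jump table indexed by opcode; requires GNU C computed goto. */
	static const void *const dispatch[] = {&&op_set, &&op_add, &&op_done};
	int			acc = 0;

	/* Every op ends with its own indirect jump instead of a shared one. */
#define TOY_NEXT() goto *dispatch[op->opcode]

	TOY_NEXT();

op_set:
	acc = op->arg;
	op++;
	TOY_NEXT();

op_add:
	acc += op->arg;
	op++;
	TOY_NEXT();

op_done:
	return acc;

#undef TOY_NEXT
}
```

Whether the attribute really keeps the jumps separate depends on the compiler version, so the generated assembly would still need checking.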
> If you're
> worried about pessimizing the out-of-line subroutines, we could move
> those to a different file --- it's pretty questionable that they're
> in execExprInterp.c in the first place, considering they're meant to be
> used by more than just that execution method.
I indeed am, but having the code in the same file has a minor advantage:
it allows the compiler to partially inline them, if it feels like it
(e.g. moving null checks inline).
Greetings,
Andres Freund