From: | Andres Freund <andres(at)anarazel(dot)de> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Ants Aasma <ants(dot)aasma(at)eesti(dot)ee>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: WIP: Faster Expression Processing v4 |
Date: | 2017-03-27 05:43:14 |
Message-ID: | 20170327054314.qo6xsnlk7jcb7u2c@alap3.anarazel.de |
Lists: | pgsql-hackers |
On 2017-03-25 20:59:27 -0700, Andres Freund wrote:
> On 2017-03-25 23:51:45 -0400, Tom Lane wrote:
> > Andres Freund <andres(at)anarazel(dot)de> writes:
> > > On March 25, 2017 4:56:11 PM PDT, Ants Aasma <ants(dot)aasma(at)eesti(dot)ee> wrote:
> > >> I haven't had the time to research this properly, but initial tests
> > >> show that with GCC 6.2 adding
> > >>
> > >> #pragma GCC optimize ("no-crossjumping")
> > >>
> > >> fixes merging of the op tail jumps.
> > >>
> > >> Some quick and dirty benchmarking suggests that the benefit for the
> > >> interpreter is about 15% (5% speedup on a workload that spends 1/3 in
> > >> ExecInterpExpr). My idea of prefetching op->resnull/resvalue to local
> > >> vars before the indirect jump is somewhere between a tiny benefit and
> > >> no effect, certainly not worth introducing extra complexity. Clang 3.8
> > >> does the correct thing out of the box and is a couple of percent
> > >> faster than GCC with the pragma.
> >
> > > That's large enough to be worth doing (although I recall you seeing all jumps commonalized). We should probably do this on a per function basis however (either using pragma push option, or function attributes).
> >
> > Seems like it would be fine to do it on a per-file basis.
>
> I personally find per-function annotation à la
> __attribute__((optimize("no-crossjumping")))
> cleaner anyway. I tested that, and it seems to work.
>
> Obviously we'd have to hide that behind a configure test. Could also do
> tests based on __GNUC__ / __GNUC_MINOR__, but that seems uglier.

Checking for this isn't entirely pretty - see my attached attempt at
doing so. I considered hiding
__attribute__((optimize("no-crossjumping"))) in execInterpExpr.c behind
a macro (like PG_DISABLE_CROSSJUMPING), but I don't really think that
makes things better.
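
For readers following along, here is a minimal sketch of what the macro variant could look like. It is not taken from the attached patch. The compiler-version check stands in for a real configure test, the toy opcodes and the run_ops() function are invented for illustration, and the computed-goto dispatch assumes GCC or Clang ("labels as values" is a GNU extension).

```c
#include <stdint.h>

/*
 * Assumption for this sketch: a configure test would normally decide
 * whether the attribute is usable. A plain compiler check stands in
 * for it here. Clang is excluded because it does not implement the
 * optimize attribute (and, per the thread, does not need it).
 */
#if defined(__GNUC__) && !defined(__clang__)
#define PG_DISABLE_CROSSJUMPING __attribute__((optimize("no-crossjumping")))
#else
#define PG_DISABLE_CROSSJUMPING
#endif

/* Hypothetical opcodes; the order must match the dispatch table below. */
typedef enum { OP_CONST, OP_ADD, OP_DONE } OpCode;

typedef struct
{
    OpCode  code;
    int64_t arg;
} Op;

/*
 * Toy interpreter with one indirect jump at the tail of every opcode.
 * Crossjumping may merge those identical tails into a single shared
 * jump; the attribute asks GCC to leave them separate in this function
 * only.
 */
PG_DISABLE_CROSSJUMPING
int64_t
run_ops(const Op *op)
{
    static const void *const dispatch[] = {&&op_const, &&op_add, &&op_done};
    int64_t acc = 0;

    goto *dispatch[op->code];

op_const:
    acc = op->arg;
    op++;
    goto *dispatch[op->code];

op_add:
    acc += op->arg;
    op++;
    goto *dispatch[op->code];

op_done:
    return acc;
}
```

Whether the tail jumps actually stay separate has to be checked in the generated assembly; the sketch only shows where the annotation would go.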

Comments?

Greetings,

Andres Freund

Attachment | Content-Type | Size |
---|---|---|
0001-Disable-gcc-s-crossjumping-optimization-for-ExecInte.patch | text/x-patch | 6.4 KB |