From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: WIP: Faster Expression Processing v4 |
Date: | 2017-03-25 16:22:15 |
Message-ID: | 5768.1490458935@sss.pgh.pa.us |
Lists: | pgsql-hackers |
More random musing ... have you considered making the jump-target fields
in expressions be relative rather than absolute indexes? That is,
EEO_JUMP would look like
    op += (stepno); \
    EEO_DISPATCH(); \
instead of
    op = &state->steps[stepno]; \
    EEO_DISPATCH(); \
I have not carried out a full patch to make this work, but just making
that one change and examining the generated assembly code looks promising.
Instead of this
    movslq  40(%r14), %r8
    salq    $6, %r8
    addq    24(%rbx), %r8
    movq    %r8, %r14
    jmp     *(%r8)
we get this
    movslq  40(%r14), %rax
    salq    $6, %rax
    addq    %rax, %r14
    jmp     *(%r14)
which certainly looks like it ought to be faster: it is one instruction
shorter and the add no longer has a memory operand. Also, the real reason
I got interested in this at all is that with relative jumps, groups of
steps would be position-independent within the steps array, which would
enable some compile-time tricks that seem impractical with the current
definition.
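To illustrate the position-independence point, here is a minimal sketch in C. It is not the executor's code: the `Step` type, the opcodes, and the helper names are all hypothetical stand-ins for the real step array. The jump argument is a distance in steps from the current step, so the same three-step fragment can be copied to any offset in the array without rewriting its jump target. With absolute indexes, the target would have to be adjusted to `offset + 2` on every copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified step type; not PostgreSQL's ExprEvalStep. */
typedef enum { OP_ADD, OP_JUMP_IF_NEG, OP_DONE } OpCode;

typedef struct Step
{
    OpCode opcode;
    int    arg;   /* addend for OP_ADD; relative distance for OP_JUMP_IF_NEG */
} Step;

/* Interpret steps starting at 'op'; jumps are relative to the current step. */
static int
run_relative(const Step *op, int acc)
{
    for (;;)
    {
        switch (op->opcode)
        {
            case OP_ADD:
                acc += op->arg;
                op++;
                break;
            case OP_JUMP_IF_NEG:
                if (acc < 0)
                    op += op->arg;      /* relative jump: op += (stepno) */
                else
                    op++;
                break;
            case OP_DONE:
                return acc;
        }
    }
}

/* "If acc is negative, skip the add": the jump distance is always 2. */
static const Step fragment[] = {
    {OP_JUMP_IF_NEG, 2},
    {OP_ADD, 100},
    {OP_DONE, 0},
};

#define FRAGMENT_LEN (sizeof(fragment) / sizeof(fragment[0]))
#define MAX_STEPS 8

/* Copy the fragment, unmodified, to 'offset' in a steps array and run it. */
static int
run_fragment_at(size_t offset, int acc)
{
    Step   steps[MAX_STEPS];
    size_t i;

    assert(offset + FRAGMENT_LEN <= MAX_STEPS);

    for (i = 0; i < MAX_STEPS; i++)
        steps[i] = (Step) {OP_DONE, 0};
    for (i = 0; i < offset; i++)
        steps[i] = (Step) {OP_ADD, 0};  /* no-op padding before the fragment */
    memcpy(&steps[offset], fragment, sizeof(fragment));

    return run_relative(steps, acc);
}
```

Calling `run_fragment_at` with different offsets gives the same result for the same input, which is the property that would let groups of steps be prepared once and placed anywhere in the array.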
BTW, now that I've spent a bit of time looking at the generated assembly
code, I'm kind of disinclined to believe any arguments about how we have
better control over branch prediction with the jump-threading
implementation. At least with current gcc (6.3.1 on Fedora 25) at -O2,
what I see is multiple places jumping to the same indirect jump
instruction :-(. It's not a total disaster: as best I can tell, all the
uses of EEO_JUMP remain distinct. But gcc has chosen to implement about
40 of the 71 uses of EEO_NEXT by jumping to the same couple of
instructions that increment the "op" register and then do an indirect
jump :-(.
So it seems that we're at the mercy of gcc's whims as to which instruction
dispatches will be distinguishable to the hardware; which casts a very
dark shadow over any benchmarking-based arguments that X is better than Y
for branch prediction purposes. Compiler version differences are likely
to matter a lot more than anything we do.
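For readers unfamiliar with the technique, the following is a minimal sketch of jump-threaded dispatch. It assumes GCC or Clang, because it relies on the "labels as values" extension, and its opcode and function names are hypothetical. Each handler ends with its own copy of the "advance and dispatch" sequence, which is what EEO_NEXT is meant to provide. Nothing in the source forces the compiler to keep those copies separate in the generated code, so whether each handler really gets its own indirect jump has to be checked in the assembly.

```c
/* Requires GCC or Clang: uses the "labels as values" extension. */
typedef enum { T_INC, T_DOUBLE, T_DONE } ThreadedOp;

static int
run_threaded(const ThreadedOp *op, int acc)
{
    /* One label address per opcode, indexed by the opcode value. */
    static void *const dispatch[] = { &&do_inc, &&do_double, &&do_done };

/* Advance to the next step and dispatch; written out at every use site. */
#define NEXT() do { op++; goto *dispatch[*op]; } while (0)

    goto *dispatch[*op];

do_inc:
    acc += 1;
    NEXT();

do_double:
    acc *= 2;
    NEXT();

do_done:
    return acc;

#undef NEXT
}

/* Hypothetical sample program: increment twice, double, then stop. */
static int
run_sample_program(int acc)
{
    static const ThreadedOp program[] = { T_INC, T_INC, T_DOUBLE, T_DONE };

    return run_threaded(program, acc);
}
```

Compiling a file like this with `gcc -O2 -S` and counting the indirect `jmp` instructions is one way to see whether the per-handler dispatches survived optimization.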
regards, tom lane