From: | Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | amitdkhan(dot)pg(at)gmail(dot)com |
Cc: | robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: asynchronous and vectorized execution |
Date: | 2016-09-12 09:02:59 |
Message-ID: | 20160912.180259.48009563.horiguchi.kyotaro@lab.ntt.co.jp |
Lists: | pgsql-hackers |
Hello,
At Thu, 01 Sep 2016 16:12:31 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp> wrote in <20160901(dot)161231(dot)110068639(dot)horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
> There's performance degradation for non-asynchronous nodes, as
> shown as 't0' below.
>
> The patch adds two "if-then" and one additional function call as
> asynchronous stuff into ExecProcNode, which is heavily passed and
> formerly consisted of only five meaningful lines. The stuff slows
> performance by about 1% for simple seqscan case. The following is
> the performance numbers previously shown upthread. (Or the
> difference might be too small to get meaningful performance
> difference..)
I tried __builtin_expect before moving the stuff out of
ExecProcNode (patch attached). I found a conversation about the
builtin in a past discussion:
> If we can show cases where it reliably produces a significant
> speedup, then I would think it would be worthwhile
I got the following results.
master(67e1e2a)-O2
       time(ms)  stddev(ms)
  t0:  3928.22   ( 0.40)   # Simple SeqScan only
  pl:  1665.14   ( 0.53)   # Append(SeqScan)

Patched-O2 / NOT Use __builtin_expect
  t0:  4042.69   ( 0.92)   degradation to master is 2.9%
  pl:  1698.46   ( 0.44)   degradation to master is 2.0%

Patched-O2 / Use __builtin_expect
  t0:  3886.69   ( 1.93)   *gain* to master is 1.06%
  pl:  1671.66   ( 0.67)   degradation to master is 0.39%
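For reference, the percentages above are relative changes against
master. A minimal sketch of that arithmetic follows; the function name
relative_change_pct is made up for illustration and is not part of the
benchmark scripts.

```c
/*
 * Relative change of a measured time against a baseline, in percent.
 * Positive means slower than the baseline (degradation), negative
 * means faster (gain). The name is hypothetical.
 */
double
relative_change_pct(double baseline_ms, double measured_ms)
{
    return (measured_ms - baseline_ms) / baseline_ms * 100.0;
}
```

Calling it with the t0 numbers, relative_change_pct(3928.22, 4042.69)
gives roughly 2.9 and relative_change_pct(3928.22, 3886.69) gives
roughly -1.06, which matches the figures listed.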
I haven't directly examined how the builtin affects optimization
of the surrounding code, but I suspect there is some effect. I
also tried it in ExecAppend but saw no difference. The numbers
fluctuate easily with any change in the machine's state, so the
lower digits aren't trustworthy, but several successive
repetitions showed fluctuations of up to a few milliseconds.
ExecProcNode can stay as it is if __builtin_expect is usable,
but ExecAppend still needs improvement.
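For readers unfamiliar with the builtin, the general technique looks
roughly like the sketch below. This is not the attached patch: the
likely/unlikely macros, DemoNode, demo_async_path and demo_exec_node
are all hypothetical names, and the sketch only assumes that the
asynchronous branch is the rarely taken one.

```c
/*
 * Branch hints via __builtin_expect (GCC and Clang). On other
 * compilers the macros fall back to the plain condition.
 * All names below are hypothetical, for illustration only.
 */
#if defined(__GNUC__) || defined(__clang__)
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)
#else
#define likely(x)   (x)
#define unlikely(x) (x)
#endif

typedef struct DemoNode
{
    int is_async;       /* assumed to be set only rarely */
    int value;          /* stand-in for the node's result */
    int async_calls;    /* counts trips through the slow path */
} DemoNode;

/* Rarely used path, kept out of line from the hot function. */
int
demo_async_path(DemoNode *node)
{
    node->async_calls++;
    return -node->value;
}

/* Hot function: the hint tells the compiler the branch is cold. */
int
demo_exec_node(DemoNode *node)
{
    if (unlikely(node->is_async))
        return demo_async_path(node);

    return node->value;
}
```

The hint does not change what the function returns; it only gives
the compiler a reason to lay out the common path as the fall-through
case, so any benefit has to be confirmed by measurement.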
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachment | Content-Type | Size |
---|---|---|
0001-Use-__builtin_expect-to-optimize-branches.patch | text/x-patch | 1.6 KB |