Re: SQLFunctionCache and generic plans

From: Tom Lane
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Cc: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>, Alexander Korotkov <aekorotkov(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org, Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>
Subject: Re: SQLFunctionCache and generic plans
Date: 2025-02-27 19:52:40
Lists: pgsql-hackers

Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> writes:
> On Thu, Feb 27, 2025 at 13:25, Alexander Pyhalov <
> a(dot)pyhalov(at)postgrespro(dot)ru> wrote:
>>> Unfortunately, there is about 5% slowdown for inlined code, and for
>>> just plpgsql code too.

>> Hi. I've tried to reproduce slowdown and couldn't.

> I'll try to get profiles.

I tried to reproduce this too. What I got on my usual development
workstation (RHEL8/gcc 8.5.0 on x86_64) was:

fx2 example: v6 patch about 2.4% slower than HEAD
fx4 example: v6 patch about 7.3% slower than HEAD

I was quite concerned after that result, but then I tried it on
another machine (macOS/clang 16.0.0 on Apple M1) and got:

fx2 example: v6 patch about 0.2% slower than HEAD
fx4 example: v6 patch about 0.7% faster than HEAD

(These are average-of-three-runs tests on --disable-cassert
builds; I trust you guys were not doing performance tests on
assert-enabled builds?)

So taken together, our results are all over the map, anywhere
from 7% speedup to 7% slowdown. My usual rule of thumb is that
you can see up to 2% variation in this kind of microbenchmark even
when "nothing has changed", just due to random build details like
whether critical loops cross a cacheline or not. 7% is pretty
well above that threshold, but maybe it's just random build
variation anyway.
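
As a rough illustration of the comparison described above (a sketch only, not the script used for these measurements), the following averages the runs for each build, reports the patched build's change relative to HEAD, and checks it against the 2% rule of thumb. The function names and the sample timings are hypothetical; only the 2% threshold and the three-run averaging come from the text.

```python
from statistics import mean

# Rule-of-thumb noise level for this kind of microbenchmark (from the text).
NOISE_THRESHOLD_PCT = 2.0


def percent_slower(head_runs, patched_runs):
    """Return how much slower the patched build is than HEAD, in percent.

    Each argument is a list of wall-clock timings (same units) for repeated
    runs of the same test. A negative result means the patched build is faster.
    """
    head_avg = mean(head_runs)
    patched_avg = mean(patched_runs)
    return (patched_avg - head_avg) / head_avg * 100.0


def within_noise(delta_pct, threshold_pct=NOISE_THRESHOLD_PCT):
    """True if the difference is no larger than the assumed noise level."""
    return abs(delta_pct) <= threshold_pct


if __name__ == "__main__":
    # Hypothetical timings in seconds, three runs per build.
    head = [10.00, 10.02, 9.98]
    patched = [10.71, 10.75, 10.73]
    delta = percent_slower(head, patched)
    verdict = "within noise" if within_noise(delta) else "above noise threshold"
    print(f"patched is {delta:+.1f}% vs HEAD ({verdict})")
```

A result above the threshold, as here, does not by itself prove a real regression; it only means the difference is larger than what build-to-build variation usually explains, so repeating the test on other machines is still worthwhile.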

Furthermore, since neither example involves functions.c at all
(fx2 would be inlined, and fx4 isn't SQL-language), it's hard
to see how the patch would directly affect either example unless
it were adding overhead to plancache.c. And I don't see any
changes there that would amount to meaningful overhead for the
existing use-case with a raw parse tree.

So right at the moment I'm inclined to write this off as
measurement noise. Perhaps it'd be worth checking a few
more platforms, though.

regards, tom lane

