Re: Make tuple deformation faster

From: Andy Fan <zhihuifan1213(at)163(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Make tuple deformation faster
Date: 2024-07-02 01:07:23
Message-ID: 87h6d8vbxw.fsf@163.com
Lists: pgsql-hackers

David Rowley <dgrowleyml(at)gmail(dot)com> writes:

> You can see the branch predictor has done a *much* better job in the
> patched code vs master with about 10x fewer misses. This should have
> helped contribute to the "insn per cycle" increase. 4.29 is quite
> good for postgres. I often see that around 0.5. According to [1]
> (relating to Zen4), "We get a ridiculous 12 NOPs per cycle out of the
> micro-op cache". I'm unsure how micro-ops translate to "insn per
> cycle" that's shown in perf stat. I thought 4-5 was about the maximum
> pipeline size from today's era of CPUs. Maybe someone else can explain
> better than I can. In more simple terms, generally, the higher the
> "insn per cycle", the better. Also, the lower all of the idle and
> branch miss percentages are that's generally also better. However,
> you'll notice that the patched version has more front and backend
> stalls. I assume this is due to performing more instructions per cycle
> from improved branch prediction causing memory and instruction stalls
> to occur more frequently, effectively (I think) it's just hitting the
> next bottleneck(s) - memory and instruction decoding. At least, modern
> CPUs should be able to out-pace RAM in many workloads, so perhaps it's
> not that surprising that "backend cycles idle" has gone up due to such
> a large increase in instructions per cycle due to improved branch
> prediction.

Thanks for the answer, this is just another area that deserves exploring.

> It would be nice to see this tested on some modern Intel CPU. A 13th
> series or 14th series, for example, or even any intel from the past 5
> years would be better than nothing.

I have two kinds of CPUs.

a). Intel Xeon Processor (Icelake) on my ECS
b). Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz on my Mac.

My ECS reports "<not supported> branch-misses", probably because it
runs under virtualization software, and the Mac doesn't support perf yet :(
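
For reference, here is a minimal sketch of how the derived figures discussed
above ("insn per cycle", branch miss rate) relate to the raw counters, and how
a "<not supported>" counter might be handled when comparing runs. The sample
text and its counts are made up for illustration, and parse_perf_stat and
ratio are hypothetical helper names, not part of perf.

```python
import re

# Matches lines such as "  1,234  instructions" or "  <not supported>  branch-misses".
LINE_RE = re.compile(r"^\s*(<not supported>|<not counted>|[\d,]+)\s+([\w.:-]+)")

def parse_perf_stat(text):
    """Return {event: count or None}; None marks an event perf could not count."""
    counters = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            value, event = m.groups()
            counters[event] = None if value.startswith("<") else int(value.replace(",", ""))
    return counters

def ratio(counters, num, den):
    """Return counters[num] / counters[den], or None if either is unavailable."""
    n, d = counters.get(num), counters.get(den)
    if n is None or not d:
        return None
    return n / d

# Hypothetical output, shaped like `perf stat` on a VM without a branch-miss counter.
SAMPLE = """
     4,000,000      cycles
     8,000,000      instructions
     1,000,000      branches
   <not supported>      branch-misses
"""

counters = parse_perf_stat(SAMPLE)
ipc = ratio(counters, "instructions", "cycles")           # instructions per cycle
miss_rate = ratio(counters, "branch-misses", "branches")  # None on this VM
```

With counters like these, the instructions-per-cycle figure can still be
compared between master and patched, but the branch miss rate cannot, so such
a machine only helps with part of the comparison.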

--
Best Regards
Andy Fan
