From: Andres Freund <andres(at)anarazel(dot)de>
To: Tomas Vondra <tomas(at)vondra(dot)me>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Nazir Bilal Yavuz <byavuz81(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring
Date: 2025-02-14 18:44:36
Message-ID: xif2lgn7obsi5brj7llkzomcia2pn5nwqlyjnkjruknknclbws@vgw2kaldktxw
Lists: pgsql-hackers

Hi,
On 2025-02-14 18:18:47 +0100, Tomas Vondra wrote:
> FWIW this does not change anything in the detection of sequential access
> patterns, discussed nearby, because the benchmarks started before Andres
> looked into that. If needed, I can easily rerun these tests, I just need
> a patch to apply.
>
> But if there really is some sort of issue, it'd make sense why it's much
> worse on the older SATA SSDs, while NVMe devices perform somewhat
> better. Because AFAICS the NVMe devices are better at handling random
> I/O with shorter queues.
I think the results are complicated because there are two counteracting
factors influencing performance:
1) read stream doing larger reads -> considerably faster
2) read stream not doing prefetching -> more IO stalls
1) will be a bigger boon on disks where you're not bottlenecked as much by
interface limits. Whereas SATA is limited to ~500MB/s, NVMe started out at
3GB/s. So this gain will matter more on NVMes.
At least on my machine 2) is what causes CPU idle states to kick in, which is
what causes a good bit of the slowdown. How expensive the idle states are,
how quickly they kick in, etc. seem to depend a lot on the CPU model, BIOS
settings and "platform settings" (mainboard manufacturer settings).
The worse a disk is at random IO, the longer the stalls are (adding time) and
the deeper the idle state that can be reached (further increasing latency).
I.e. SATA will be worse.
It might be interesting to run the benchmark with CPU idle states disabled, at
least on the subset of cores you run the test on. E.g.
cpupower -c 13 idle-set -D1
will disable idle states that have a transition time worse than 1us for core
13.
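Wrapped around the benchmark, that could look roughly like this (untested
sketch; <benchmark> is a placeholder for the actual test invocation, and the
core number just mirrors the example above):

  # disable idle states with a transition latency of 1us or worse on core 13
  sudo cpupower -c 13 idle-set -D1
  # check what ended up disabled (1 = disabled)
  grep . /sys/devices/system/cpu/cpu13/cpuidle/state*/disable
  <benchmark>
  # re-enable all idle states afterwards
  sudo cpupower -c 13 idle-set -E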
Sometimes disabling idle states for all cores will have deleterious effects,
due to reducing the thermal budget for turbo boost. E.g. on my older
workstation a core can boost to 3.4GHz if the whole system is at -E (all idle
states enabled) and only to 3GHz at -D0 (all idle states disabled).
Instead of disabling idle states, you could also just monitor them (cpupower
monitor <benchmark> or turbostat --quiet <benchmark>).
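For instance (again just a sketch, with <benchmark> as a placeholder; which
monitors are available differs between systems):

  # summarize C-state residency over the benchmark run
  cpupower monitor -m Idle_Stats <benchmark>
  # or, also showing effective core frequencies
  sudo turbostat --quiet <benchmark>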
Greetings,
Andres Freund