From: | Lukas Fittl <lukas(at)fittl(dot)com> |
---|---|
To: | Andres Freund <andres(at)anarazel(dot)de> |
Cc: | David Geier <geidav(dot)pg(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Ibrar Ahmed <ibrar(dot)ahmad(at)gmail(dot)com>, Maciek Sakrejda <m(dot)sakrejda(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Reduce timing overhead of EXPLAIN ANALYZE using rdtsc? |
Date: | 2025-03-01 07:45:58 |
Message-ID: | CAP53PkzO2KpscD-tgFW_V-4WS+vkniH4-B00eM-e0bsBF-xUxg@mail.gmail.com |
Lists: | pgsql-hackers |
On Sun, Jun 2, 2024 at 1:08 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> At some point this patch switched from rdtsc to rdtscp, which imo largely
> negates the point of it. What lead to that?
From what I can gather, it appears this was an oversight when David first
reapplied the work on the instr_time changes that were committed.
I've come back to this and rebased it, with the following changes:
- Corrected the use of RDTSCP to RDTSC in pg_get_ticks_fast
- Added a check of CPUID leaf 16H if leaf 15H does not contain frequency
information (per research, relevant for some CPUs)
- Fixed incorrect reporting in pg_test_timing caused by a too-small histogram
counter (32 => 64 bits)
- Fixed indentation per pgindent
- Added support for VMs running under KVM/VMware Hypervisors
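To illustrate the 15H/16H fallback and the hypervisor detection from the list above, here is a minimal sketch. It is not the code from the patch: the helper names (tsc_khz_from_cpuid, hypervisor_from_signature) are hypothetical, and both functions take raw CPUID register values as arguments so the logic can be read independently of how CPUID is invoked. It assumes the documented register layout: leaf 15H returns the TSC/crystal ratio in EBX/EAX and the crystal frequency in Hz in ECX, leaf 16H returns the base frequency in MHz in EAX[15:0], and hypervisor leaf 0x40000000 returns a 12-byte vendor signature in EBX, ECX, EDX.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: derive the TSC frequency
 * in kHz from raw CPUID register values. Returns 0 if neither leaf yields
 * a usable value.
 */
static inline uint32_t
tsc_khz_from_cpuid(uint32_t eax15, uint32_t ebx15, uint32_t ecx15,
                   uint32_t eax16)
{
    /* Preferred: crystal frequency (Hz) times the TSC/crystal ratio. */
    if (eax15 != 0 && ebx15 != 0 && ecx15 != 0)
        return (uint32_t) (((uint64_t) ecx15 * ebx15 / eax15) / 1000);

    /*
     * Fallback (assumption): leaf 15H lacks the crystal frequency, so use
     * the base frequency in MHz from leaf 16H as the TSC frequency.
     */
    if ((eax16 & 0xFFFF) != 0)
        return (eax16 & 0xFFFF) * 1000;

    return 0;
}

typedef enum
{
    HV_NONE_OR_UNKNOWN,
    HV_KVM,
    HV_VMWARE
} hv_kind;

/*
 * Hypothetical helper: classify the vendor signature returned in EBX, ECX,
 * EDX by CPUID leaf 0x40000000. Assumes a little-endian host, which holds
 * on x86.
 */
static inline hv_kind
hypervisor_from_signature(uint32_t ebx, uint32_t ecx, uint32_t edx)
{
    char sig[13];

    memcpy(sig, &ebx, 4);
    memcpy(sig + 4, &ecx, 4);
    memcpy(sig + 8, &edx, 4);
    sig[12] = '\0';

    /* KVM pads its signature with NUL bytes: "KVMKVMKVM\0\0\0". */
    if (strcmp(sig, "KVMKVMKVM") == 0)
        return HV_KVM;
    if (strcmp(sig, "VMwareVMware") == 0)
        return HV_VMWARE;
    return HV_NONE_OR_UNKNOWN;
}
```

For example, a 24 MHz crystal with a 250/2 ratio gives tsc_khz_from_cpuid(2, 250, 24000000, 0) == 3000000, i.e. 3 GHz. Some hypervisors additionally expose the TSC frequency in kHz in EAX of leaf 0x40000010 (a VMware-defined timing leaf that KVM/QEMU can also provide); that part is left out of the sketch.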
On that last item, this does indeed make a difference on VMs, contrary to
the code comment in earlier versions (and I've not seen any odd behaviors
again, FWIW):
On a c5.xlarge (Skylake-SP or Cascade Lake) on AWS, with the same test as
done initially in this thread:
SELECT COUNT(*) FROM lotsarows;
Time: 974.423 ms
EXPLAIN (ANALYZE, TIMING OFF) SELECT COUNT(*) FROM lotsarows;
Time: 1336.196 ms (00:01.336)
Without patch:
EXPLAIN (ANALYZE) SELECT COUNT(*) FROM lotsarows;
Time: 2165.069 ms (00:02.165)
Per loop time including overhead: 22.15 ns
With patch:
EXPLAIN (ANALYZE, TIMING ON) SELECT COUNT(*) FROM lotsarows;
Time: 1654.289 ms (00:01.654)
Per loop time including overhead: 9.81 ns
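To make the comparison explicit, the timing overhead can be expressed relative to the TIMING OFF run. The sketch below only does arithmetic on the numbers reported above; the function names are made up for illustration.

```c
/* Timings in milliseconds, copied from the measurements above. */
static const double timing_off_ms = 1336.196;   /* EXPLAIN (ANALYZE, TIMING OFF) */
static const double unpatched_ms = 2165.069;    /* EXPLAIN (ANALYZE), without patch */
static const double patched_ms = 1654.289;      /* EXPLAIN (ANALYZE, TIMING ON), with patch */

/* Extra time attributable to per-node timing, relative to TIMING OFF. */
static inline double
timing_overhead_ms(double with_timing_ms)
{
    return with_timing_ms - timing_off_ms;
}

/* Fraction of the unpatched timing overhead that the patch removes. */
static inline double
overhead_reduction(void)
{
    double before = timing_overhead_ms(unpatched_ms);
    double after = timing_overhead_ms(patched_ms);

    return 1.0 - after / before;
}
```

With these inputs the timing overhead drops from about 829 ms to about 318 ms, a reduction of roughly 62%, which is in line with the per-loop time going from 22.15 ns to 9.81 ns.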
I'm registering this again in the current commitfest to help reviews.
Open questions I have:
- Could we rely on checking whether the TSC timesource is invariant (via
CPUID), instead of relying on Linux choosing it as a clocksource?
- For the hypervisor CPUID checks I had to rely on __cpuidex, which is only
available in newer GCC versions
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95973). How do we best check
for its presence: by compiler version, or rather with a configure check? Note
that this is also the reason the patch fails the clang compiler warning check
in CI, despite clang having support in recent versions
(https://reviews.llvm.org/D121653).
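For reference on both questions, here is a minimal sketch of what a CPUID-based check could look like. It is not part of the patch and the function names are hypothetical. It assumes GCC or clang on x86, where <cpuid.h> provides __get_cpuid() and the __cpuid_count() macro; the latter takes a subleaf and predates __cpuidex, so it might serve as a fallback. The invariant TSC flag is bit 8 of EDX in CPUID leaf 0x80000007. On other platforms the sketch simply reports false.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Invariant TSC is reported in EDX bit 8 of CPUID leaf 0x80000007. */
#define CPUID_APM_LEAF              0x80000007u
#define CPUID_APM_EDX_INVARIANT_TSC (1u << 8)

/* Decode the invariant TSC flag from a raw EDX value of leaf 0x80000007. */
static inline bool
edx_has_invariant_tsc(uint32_t edx)
{
    return (edx & CPUID_APM_EDX_INVARIANT_TSC) != 0;
}

/* Hypothetical check: does this CPU advertise an invariant TSC? */
static inline bool
cpu_has_invariant_tsc(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid() returns 0 if the requested leaf is not supported. */
    if (!__get_cpuid(CPUID_APM_LEAF, &eax, &ebx, &ecx, &edx))
        return false;
    return edx_has_invariant_tsc(edx);
#else
    return false;
#endif
}

#if defined(__x86_64__) || defined(__i386__)
/*
 * Possible stand-in for __cpuidex() built on the __cpuid_count() macro.
 * It performs no max-leaf check, so the caller must know the leaf exists.
 */
static inline void
cpuid_subleaf(unsigned int leaf, unsigned int subleaf, unsigned int regs[4])
{
    __cpuid_count(leaf, subleaf, regs[0], regs[1], regs[2], regs[3]);
}
#endif
```

This only shows how the flag can be read. It does not settle whether the flag alone is a sufficient replacement for checking the Linux clocksource.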
Thanks,
Lukas
--
Lukas Fittl
Attachment | Content-Type | Size |
---|---|---|
v10-0001-instr_time-Add-INSTR_TIME_SET_SECONDS-INSTR_TIME.patch | application/octet-stream | 2.1 KB |
v10-0003-Use-time-stamp-counter-to-measure-time-on-Linux-.patch | application/octet-stream | 17.6 KB |
v10-0002-wip-report-nanoseconds-in-pg_test_timing.patch | application/octet-stream | 11.3 KB |