From: | Hannu Krosing <hannuk(at)google(dot)com> |
---|---|
To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
Cc: | Peter Eisentraut <peter(at)eisentraut(dot)org>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: What is a typical precision of gettimeofday()? |
Date: | 2024-06-20 10:54:55 |
Message-ID: | CAMT0RQTukdgU4frKkJQrwywdOKY95gUD-wrchcaorJo3T0CMjA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I also have a variant that uses the low-level CPU cycle counter
directly (attached)
It currently only works on clang, as it is done using
__builtin_readcyclecounter() in order to support both x64 and ARM.
This one is there to understand the overhead of the calculations when
going from cycle counter to POSIX time struct
This works OK with Clang, but we should probably not integrate this
directly into the code as it has some interesting corner cases. For
example Apple's clang does compile __builtin_readcyclecounter() but
crashes with unknown instruction when trying to run it.
Therefore I have not integrated it into Makefile so if you want to use
it, just copy it into src/bin/pg_test_timing and run
cd src/bin/pgtest_timing
mv pg_test_timing.c pg_test_timing.c.backup
cp pg_test_cyclecounter.c pg_test_timing.c
make
mv pg_test_timing pg_test_cyclecounter
mv pg_test_timing.c.backup pg_test_timing.c
It gives output like
Testing timing overhead for 3 seconds.
Total 25000001 ticks in 1000000073 ns, 24999999.175000 ticks per ns
This CPU is running at 24999999 ticks / second, will run test for 74999997 ticks
loop_count 287130958Per loop time including overhead: 10.45 ns, min: 0
ticks (0.0 ns), same: 212854591
Total ticks in: 74999997, in: 3000000541 nr
Log2 histogram of timing durations:
< ticks ( < ns) % of total running % count
1 ( 40.0) 74.1315 74.1315 212854591
2 ( 80.0) 25.8655 99.9970 74267876
4 ( 160.0) 0.0000 99.9970 7
8 ( 320.0) 0.0000 99.9970 3
16 ( 640.0) 0.0000 99.9970 1
32 ( 1280.0) 0.0000 99.9971 27
64 ( 2560.0) 0.0012 99.9983 3439
128 ( 5120.0) 0.0016 99.9999 4683
256 ( 10240.0) 0.0001 100.0000 265
512 ( 20480.0) 0.0000 100.0000 37
1024 ( 40960.0) 0.0000 100.0000 23
2048 ( 81920.0) 0.0000 100.0000 6
First 64 ticks --
0 ( 0.0) 74.1315 74.1315 212854591
1 ( 40.0) 25.8655 99.9970 74267876
2 ( 80.0) 0.0000 99.9970 2
3 ( 120.0) 0.0000 99.9970 5
4 ( 160.0) 0.0000 99.9970 2
6 ( 240.0) 0.0000 99.9983 1
13 ( 520.0) 0.0000 100.0000 1
...
59 ( 2360.0) 0.0000 100.0000 140
60 ( 2400.0) 0.0001 100.0000 210
61 ( 2440.0) 0.0002 100.0000 497
62 ( 2480.0) 0.0002 100.0000 524
63 ( 2520.0) 0.0001 100.0000 391
If you run on some interesting hardware, please share the results.
If we her enough I will put them together in a spreadsheet and share
I also attach my lightning talk slides here
---
Hannu
On Thu, Jun 20, 2024 at 12:41 PM Hannu Krosing <hannuk(at)google(dot)com> wrote:
>
> (resending to list and other CC:s )
>
> Hi Tom
>
> This is my current patch which also adds running % and optionally uses
> faster way to count leading zeros, though I did not see a change from
> that.
>
> It also bucketizes first 128 ns to get better overview of exact behaviour.
>
> We may want to put reporting this behind a flag
>
> ---
> Hannu
>
> On Wed, Jun 19, 2024 at 6:36 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > Peter Eisentraut <peter(at)eisentraut(dot)org> writes:
> > > AFAICT, pg_test_timing doesn't use gettimeofday(), so this doesn't
> > > really address the original question.
> >
> > It's not exactly hard to make it do so (see attached).
> >
> > I tried this on several different machines, and my conclusion is that
> > gettimeofday() reports full microsecond precision on any platform
> > anybody is likely to be running PG on today. Even my one surviving
> > pet dinosaur, NetBSD 10 on PowerPC Mac (mamba), shows results like
> > this:
> >
> > $ ./pg_test_timing
> > Testing timing overhead for 3 seconds.
> > Per loop time including overhead: 901.41 ns
> > Histogram of timing durations:
> > < us % of total count
> > 1 10.46074 348148
> > 2 89.51495 2979181
> > 4 0.00574 191
> > 8 0.00430 143
> > 16 0.00691 230
> > 32 0.00376 125
> > 64 0.00012 4
> > 128 0.00303 101
> > 256 0.00027 9
> > 512 0.00009 3
> > 1024 0.00009 3
> >
> > I also modified pg_test_timing to measure nanoseconds not
> > microseconds (second patch attached), and got this:
> >
> > $ ./pg_test_timing
> > Testing timing overhead for 3 seconds.
> > Per loop time including overhead: 805.50 ns
> > Histogram of timing durations:
> > < ns % of total count
> > 1 19.84234 739008
> > 2 0.00000 0
> > 4 0.00000 0
> > 8 0.00000 0
> > 16 0.00000 0
> > 32 0.00000 0
> > 64 0.00000 0
> > 128 0.00000 0
> > 256 0.00000 0
> > 512 0.00000 0
> > 1024 80.14013 2984739
> > 2048 0.00078 29
> > 4096 0.00658 245
> > 8192 0.00290 108
> > 16384 0.00252 94
> > 32768 0.00250 93
> > 65536 0.00016 6
> > 131072 0.00185 69
> > 262144 0.00008 3
> > 524288 0.00008 3
> > 1048576 0.00008 3
> >
> > confirming that when the result changes it generally does so by 1usec.
> >
> > Applying just the second patch, I find that clock_gettime on this
> > old hardware seems to be limited to 1us resolution, but on my more
> > modern machines (mac M1, x86_64) it can tick at 40ns or less.
> > Even a raspberry pi 4 shows
> >
> > $ ./pg_test_timing
> > Testing timing overhead for 3 seconds.
> > Per loop time including overhead: 69.12 ns
> > Histogram of timing durations:
> > < ns % of total count
> > 1 0.00000 0
> > 2 0.00000 0
> > 4 0.00000 0
> > 8 0.00000 0
> > 16 0.00000 0
> > 32 0.00000 0
> > 64 37.59583 16317040
> > 128 62.38568 27076131
> > 256 0.01674 7265
> > 512 0.00002 8
> > 1024 0.00000 0
> > 2048 0.00000 0
> > 4096 0.00153 662
> > 8192 0.00019 83
> > 16384 0.00001 3
> > 32768 0.00001 5
> >
> > suggesting that the clock_gettime resolution is better than 64 ns.
> >
> > So I concur with Hannu that it's time to adjust pg_test_timing to
> > resolve nanoseconds not microseconds. I gather he's created a
> > patch that does more than mine below, so I'll wait for that.
> >
> > regards, tom lane
> >
Attachment | Content-Type | Size |
---|---|---|
pg_test_cyclecounter.c | text/x-csrc | 7.3 KB |
how fast is timing.pdf | application/pdf | 385.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Hannu Krosing | 2024-06-20 11:08:57 | Re: What is a typical precision of gettimeofday()? |
Previous Message | Hannu Krosing | 2024-06-20 10:41:54 | Re: What is a typical precision of gettimeofday()? |