From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru> |
Cc: | Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Michael Paquier <michael(at)paquier(dot)xyz>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Mat Arye <mat(at)timescaledb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Stepan Neretin <sncfmgg(at)gmail(dot)com> |
Subject: | Re: UUID v7 |
Date: | 2024-11-01 21:23:30 |
Message-ID: | CAD21AoC4iAr7M_OgtHA0HZMezot68_0vwUCQjjXKk2iW89w0Jg@mail.gmail.com |
Lists: | pgsql-hackers |
On Fri, Nov 1, 2024 at 10:33 AM Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
>
>
> > On 31 Oct 2024, at 23:04, Stepan Neretin <sndcppg(at)gmail(dot)com> wrote:
> >
> >
> > Firstly, I'd like to discuss the increased_clock_precision variable, which
> > currently divides the timestamp into milliseconds and nanoseconds. However,
> > this approach only approximates the extra bits for sub-millisecond
> > precision, leading to imprecise timestamps in high-frequency UUID
> > generation.
> No, timestamp is taken in nanoseconds, we keep precision of 1/4096 of ms. If you observe precision loss anywhere let me know.
>
> >
> > To address this issue, we could consider using a more accurate method for
> > calculating the timestamp. For instance, we could utilize a higher
> > resolution clock or implement a more precise algorithm to ensure accurate
> > timestamps.
>
> That's what we do.
>
> >
> > Additionally, it would be beneficial to add validation checks for the
> > interval argument. These checks could verify that the input interval is
> > within reasonable bounds and that the calculated timestamp is accurate.
> > Examples of checks could include verifying if the interval is too small,
> > too large, or exceeds the maximum possible number of milliseconds and
> > nanoseconds in a timestamp.
>
> timestamptz_pl_interval() is already doing this.
>
> > What do you think about these suggestions? Let me know your thoughts!
>
> Thanks a lot for reviewing the patch!
>
>
> > On 1 Nov 2024, at 10:33, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Thu, Oct 31, 2024 at 9:53 PM Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> >>
> >>
> >>
> >>> On 1 Nov 2024, at 03:00, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >>>
> >>> Therefore, if the
> >>> system clock moves backward due to NTP, we cannot guarantee
> >>> monotonicity and sortability. Is that right?
> >>
> >> Not exactly. Monotonicity is ensured for a given backend. We make sure that timestamp is advanced at least for ~250ns forward on each UUID generation. 60 bits of time are unique and ascending for a given backend.
> >>
> >
> > Thank you for your explanation. I now understand this code guarantees
> > the monotonicity:
> >
> > +/* minimum amount of ns that guarantees step of increased_clock_precision */
> > +#define SUB_MILLISECOND_STEP (1000000/4096 + 1)
> > + ns = get_real_time_ns();
> > + if (previous_ns + SUB_MILLISECOND_STEP >= ns)
> > + ns = previous_ns + SUB_MILLISECOND_STEP;
> > + previous_ns = ns;
> >
> >
> > I think that one of the most important parts in UUIDv7 implementation
> > is which method (1, 2, or 3 described in RFC 9562) we use to guarantee
> > the monotonicity. The current patch employs method 3 with the
> > assumption that 12 bits of sub-millisecond information is available on
> > most of the systems we support. However, as far as I tested, on MacOS,
> > values returned by clock_gettime(CLOCK_REALTIME) are only microsecond
> > precision, meaning that we could waste some randomness. Has this point
> > been considered?
> >
>
> There was a thread "What is a typical precision of gettimeofday()?" [0]
> There we found out that routines of instr_time.h are precise enough. On my machine (MacBook Air M3) I do not observe significant differences between CLOCK_MONOTONIC_RAW and CLOCK_REALTIME in pg_test_timing results.
>
> CLOCK_MONOTONIC_RAW
> x4mmm(at)x4mmm-osx bin % ./pg_test_timing
> Testing timing overhead for 3 seconds.
> Per loop time including overhead: 15.30 ns
> Histogram of timing durations:
> < us % of total count
> 1 98.47856 193113929
> 2 1.52039 2981452
> 4 0.00025 485
> 8 0.00062 1211
> 16 0.00012 237
> 32 0.00004 79
> 64 0.00002 30
> 128 0.00000 8
> 256 0.00000 5
> 512 0.00000 3
> 1024 0.00000 1
> 2048 0.00000 2
>
> CLOCK_REALTIME
> x4mmm(at)x4mmm-osx bin % ./pg_test_timing
> Testing timing overhead for 3 seconds.
> Per loop time including overhead: 15.04 ns
> Histogram of timing durations:
> < us % of total count
> 1 98.49709 196477842
> 2 1.50268 2997479
> 4 0.00007 130
> 8 0.00012 238
> 16 0.00005 91
> 32 0.00000 4
> 64 0.00000 1
I applied the patch shared on that thread[1] to measure nanoseconds
and changed instr_time.h to use CLOCK_REALTIME even on macOS. Here are
the results on my machine (macOS 14.7, M1 Pro):
Testing timing overhead for 3 seconds.
Per loop time including overhead: 18.61 ns
Histogram of timing durations:
<= ns % of total running % count
0 98.1433 98.1433 158212921
1 0.0000 98.1433 0
3 0.0000 98.1433 0
7 0.0000 98.1433 0
15 0.0000 98.1433 0
31 0.0000 98.1433 0
63 0.0000 98.1433 0
127 0.0000 98.1433 0
255 0.0000 98.1433 0
511 0.0000 98.1433 0
1023 1.8560 99.9994 2992054
2047 0.0000 99.9994 51
4095 0.0001 99.9995 110
8191 0.0003 99.9998 463
16383 0.0002 100.0000 313
32767 0.0000 100.0000 49
65535 0.0000 100.0000 4
Timing durations less than 128 ns:
ns % of total running % count
0 98.1433 98.1433 158212921
Most of the timing durations fell into the 0 ns bin. The rest fell
into the 1023 ns bin or higher, with nothing in between.
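For readers who want to reproduce this kind of histogram without patching pg_test_timing, here is a minimal sketch of the power-of-two binning suggested by the "<= ns" column above. It is not the pg_test_timing code; the names bin_index, bin_upper_bound and collect_histogram are hypothetical, and treating a backward clock step as a 0 ns delta is my own simplification.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

#define NBINS 65				/* bin 0 plus one bin per bit of a 64-bit delta */

/* Index of the bin holding delta_ns: 0 -> 0, 1 -> 1, 2..3 -> 2, 4..7 -> 3, ... */
static int
bin_index(uint64_t delta_ns)
{
	int			idx = 0;

	while (delta_ns != 0)
	{
		idx++;
		delta_ns >>= 1;
	}
	return idx;
}

/* Inclusive upper bound of a bin, i.e. the value shown in the "<= ns" column */
static uint64_t
bin_upper_bound(int idx)
{
	return (idx >= 64) ? UINT64_MAX : (((uint64_t) 1 << idx) - 1);
}

/* Read the clock "loops" times and count the deltas between adjacent reads */
static void
collect_histogram(clockid_t clock, long loops, uint64_t counts[NBINS])
{
	struct timespec prev;
	struct timespec cur;

	clock_gettime(clock, &prev);
	for (long i = 0; i < loops; i++)
	{
		int64_t		delta;

		clock_gettime(clock, &cur);
		delta = (int64_t) (cur.tv_sec - prev.tv_sec) * 1000000000
			+ (cur.tv_nsec - prev.tv_nsec);
		if (delta < 0)
			delta = 0;			/* assumption: count a backward step as 0 ns */
		counts[bin_index((uint64_t) delta)]++;
		prev = cur;
	}
}
```

Calling collect_histogram(CLOCK_REALTIME, loops, counts) on a zeroed array and printing bin_upper_bound(i) next to counts[i] gives a table of the same shape as the one above. A clock that only ticks in microseconds would be expected to leave the bins between 0 and 1023 empty.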
I've done a simple test as well on my Mac and saw that the time
returned by clock_gettime(CLOCK_REALTIME) doesn't have nanosecond
precision:
% cat test.c
#include <stdio.h>
#include <time.h>
#include <sys/time.h>
int
main(void)
{
    struct timespec real;
    struct timespec mono;
    struct timespec mono_raw;

    clock_gettime(CLOCK_REALTIME, &real);
    clock_gettime(CLOCK_MONOTONIC, &mono);
    clock_gettime(CLOCK_MONOTONIC_RAW, &mono_raw);

    printf("real: %ld\t%ld\n", real.tv_sec, real.tv_nsec);
    printf("mono: %ld\t%ld\n", mono.tv_sec, mono.tv_nsec);
    printf("mono_raw: %ld\t%ld\n", mono_raw.tv_sec, mono_raw.tv_nsec);

    return 0;
}
% gcc -o test test.c
% ./test
real: 1730495955 515018000
mono: 3212977 834578000
mono_raw: 3212982 962799958
% ./test
real: 1730495956 78927000
mono: 3212978 398488000
mono_raw: 3212983 526718333
% ./test
real: 1730495956 652751000
mono: 3212978 972312000
mono_raw: 3212984 100552333
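To put a number on what these outputs mean for the 12-bit sub-millisecond field, here is a rough sketch. It assumes the field is computed as (nanoseconds within the millisecond) * 4096 / 1000000, which is my reading of "precision of 1/4096 of ms" in this thread and not taken from the patch. The function names decimal_granularity_ns and distinct_submilli_values are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Largest power of ten (in ns) that divides every tv_nsec sample.
 * 1000 means all samples were whole microseconds; 1 means at least
 * one sample carried nanosecond digits.
 */
static long
decimal_granularity_ns(const long *nsec, size_t n)
{
	long		g = 1000000000L;

	while (g > 1)
	{
		size_t		i;

		for (i = 0; i < n; i++)
		{
			if (nsec[i] % g != 0)
				break;
		}
		if (i == n)
			break;				/* every sample is a multiple of g */
		g /= 10;
	}
	return g;
}

/*
 * Number of distinct 12-bit values reachable when the clock only
 * advances in steps of granularity_ns.  The mapping from nanoseconds
 * to the 12-bit value is an assumption, see the note above.
 */
static int
distinct_submilli_values(long granularity_ns)
{
	unsigned char seen[4096];
	int			distinct = 0;

	if (granularity_ns <= 0)
		return 0;
	memset(seen, 0, sizeof(seen));
	for (long ns = 0; ns < 1000000; ns += granularity_ns)
	{
		uint32_t	v = (uint32_t) (((uint64_t) ns * 4096) / 1000000);

		if (!seen[v])
		{
			seen[v] = 1;
			distinct++;
		}
	}
	return distinct;
}
```

Fed the three tv_nsec values printed above, decimal_granularity_ns() returns 1000 for the "real" samples and 1 for the "mono_raw" samples. Under the assumed mapping, a 1000 ns granularity reaches only 1000 of the 4096 possible values, so roughly 2 of the 12 bits would carry no information on such a system.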
Regards,
[1] https://www.postgresql.org/message-id/3110108.1719939353%40sss.pgh.pa.us
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com