Re: UUID v7

From: Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>
To: "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Cc: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Michael Paquier <michael(at)paquier(dot)xyz>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Mat Arye <mat(at)timescaledb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Stepan Neretin <sncfmgg(at)gmail(dot)com>
Subject: Re: UUID v7
Date: 2024-11-26 19:11:21
Message-ID: 298017743.2540254.1732648282007@mail.yahoo.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Changing the name uuidv7() to uuid_v7() is a bad idea because the RFC 9562 uses the term UUIDv7, and therefore code containing uuid_v7() will not be found by searching the web in most cases.
It makes much more sense to rename it to get_uuidv7(), so that a query for "uuidv7" does not return a bunch of other unnecessary functions related to UUIDv7.

Sergey Prokhorenko sergeyprokhorenko(at)yahoo(dot)com(dot)au

On Monday 25 November 2024 at 11:12:35 pm GMT+3, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:

On Mon, Nov 25, 2024 at 10:15 AM Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
>
>
> > On 25 Nov 2024, at 22:53, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > In the following code, we use "defined(__darwin__) || defined(_MSC_VER)":
> >
> > +#if defined(__darwin__) || defined(_MSC_VER)
> > +#define SUBMS_MINIMAL_STEP_BITS 10
> > +#else
> > +#define SUBMS_MINIMAL_STEP_BITS 12
> > +#endif
> > #define SUBMS_BITS    12
> > -#define SUBMS_MINIMAL_STEP_NS ((NS_PER_MS / (1 << SUBMS_BITS)) + 1)
> > +#define SUBMS_MINIMAL_STEP_NS ((NS_PER_MS / (1 <<
> > SUBMS_MINIMAL_STEP_BITS)) + 1)
> >
> > on the other hand, we use "defined(__darwin__) || defined(WIN32)" here:
> >
> > +#if defined(__darwin__) || defined(WIN32)
> > +      /*
> > +        * On MacOS real time is truncted to microseconds. Thus, 2 least
> > +        * significant are dependent on other time-specific bits, thus
> > they do not
> > +        * contribute to uniqueness. To make these bit random we mix in two bits
> > +        * from CSPRNG.
> > +        *
> > +        * SUBMS_MINIMAL_STEP is chosen so that we still guarantee monotonicity
> > +        * despite altering these bits.
> > +        */
> > +      uuid->data[7] = uuid->data[7] ^ (uuid->data[8] >> 6);
> > +#endif
> >
> > Is there a reason for using different macros?
>
> No, that's an oversight. We should mix these 2 bits if an only if SUBMS_MINIMAL_STEP_BITS=10.
>
> <tldr>
> In your review change_v33.patch you used WIN32, but it did not actually compile on Windows.
> So on Saturday I squashed v33+change_v33.patch, and composed a message that I think we still should switch to _MSC_VER. And just before sending I received your message with v36 where you used _MSC_VER :)
>
> I think this way:
> _MSC_VER - native Windows without clock_gettime, we used gettimeofday() and 10 bits of sub-ms.
> MinGW - we use clock_gettime() and 12 bits.
> Darwin - we use clock_gettime() and 10 bits.
> Anything else -  clock_gettime() and 12 bits.
> </tldr>

Thank you for the summary.

On MinGW, IIUC we can get 100-ns precision timestamps[1], so using 12
bits for calculating the minimal step would make sense.

Also, if we implement the Windows port of clock_gettime() in the
future, we can remove the part of using gettimeofday() in
get_real_time_ns_ascending(). It seems to me that it's
over-engineering to implement that part only for the UUID v7. So the
current implementation of get_real_time_ns_ascending() makes sense to
me.

>
> >
> > In get_real_time_ns_ascending(), we use _MSC_VER so we use
> > clock_gettime() on MinGW.
> >
> >>
> >> Sergey Prokhorenko just draw my attention to the new release of MariaDB [0]. They are doing very similar UUID v7 generation as we do [1].
> >>
> >
> > Thank you for the references. It made me think that we can use the
> > function name uuid_v7() rather than uuidv7().
>
> I think it's a good idea if we will be kind of SQL-compatible.
>

Okay, let"s rename it.

I've merged patches and renamed functions (also updated the commit
message). Please find the attachment.

Regards,

[1]  https://github.com/Alexpux/mingw-w64/blob/d0d7f784833bbb0b2d279310ddc6afb52fe47a46/mingw-w64-libraries/winpthreads/src/clock.c#L119

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

  • Re: UUID v7 at 2024-11-25 20:11:55 from Masahiko Sawada

Responses

  • Re: UUID v7 at 2024-11-26 19:30:29 from Masahiko Sawada

Browse pgsql-hackers by date

  From Date Subject
Next Message Dmitry Dolgov 2024-11-26 19:17:58 Re: Changing shared_buffers without restart
Previous Message Bruce Momjian 2024-11-26 19:04:15 Re: Doc: typo in config.sgml