From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au> |
Cc: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Michael Paquier <michael(at)paquier(dot)xyz>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Mat Arye <mat(at)timescaledb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Stepan Neretin <sncfmgg(at)gmail(dot)com> |
Subject: | Re: UUID v7 |
Date: | 2024-11-15 01:44:19 |
Message-ID: | CAD21AoCHpg6a2fLhCRRv5n1eaPH39+Z+z6cS0PR_9C2JmjrHZQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Nov 11, 2024 at 12:20 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Sat, Nov 9, 2024 at 9:07 AM Sergey Prokhorenko
> <sergeyprokhorenko(at)yahoo(dot)com(dot)au> wrote:
> >
> > On Saturday 9 November 2024 at 01:00:15 am GMT+3, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > > the microsecond part is working also as a counter in a sense. It seems fine to me, but I'm slightly concerned that there is no guidance on such an implementation in RFC 9562.
> >
> > In fact, there is guidance on a similar implementation in RFC 9562:
> > https://datatracker.ietf.org/doc/html/rfc9562#name-monotonicity-and-counters
> > "Counter Rollover Handling:"
> > "Alternatively, implementations MAY increment the timestamp ahead of the actual time and reinitialize the counter."
> >
>
> Indeed, thank you.
>
> > But in the near future, this may not be enough for the highest-performance systems.
>
> Yeah, I'm concerned about this. That time might gradually come. That
> being said, as long as the rand_a part also works as a counter, it's fine.
> Also, 12 bits does not differ much, as Andrey Borodin mentioned. I
> think in the first version it's better to start with a simple
> implementation rather than over-engineering it.
>
> Regarding the implementation, the v30 patch uses only
> microsecond-precision time even on platforms where nanosecond precision
> is available, such as Linux. I think it's better to store the value of
> (sub-millisecond fraction * 4096) in the 12 bits of rand_a space instead
> of directly storing microseconds in 10 bits.
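
As a rough illustration of the idea above, here is a minimal sketch of how a nanosecond timestamp could be split into the 48-bit unix_ts_ms field and a 12-bit sub-millisecond fraction for rand_a. The struct and function names are made up for this example and are not taken from any patch version; it also assumes a non-negative timestamp.

```c
#include <stdint.h>

#define NS_PER_MS INT64_C(1000000)

/* Hypothetical holder for the two timestamp-derived UUIDv7 fields. */
typedef struct
{
    uint64_t    unix_ts_ms;     /* 48-bit millisecond timestamp */
    uint16_t    sub_ms_frac;    /* 12-bit fraction of a millisecond (rand_a) */
} uuidv7_time_fields;

/* Assumes ns >= 0 (nanoseconds since the Unix epoch). */
static inline uuidv7_time_fields
split_ns_timestamp(int64_t ns)
{
    uuidv7_time_fields f;
    int64_t     sub_ms_ns = ns % NS_PER_MS;    /* 0 .. 999999 */

    f.unix_ts_ms = (uint64_t) (ns / NS_PER_MS);
    /* scale 0 .. 999999 ns into 0 .. 4095 */
    f.sub_ms_frac = (uint16_t) ((sub_ms_ns * 4096) / NS_PER_MS);
    return f;
}
```

With this scaling each step of the 12-bit value covers roughly 244 ns, so all 4096 values are used, whereas storing microseconds directly uses only 1000 distinct values.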
IIUC the v29 patch implements UUIDv7 generation in this way. So I've
reviewed the v29 patch, and here are some review comments:
---
* Set magic numbers for a "version 4" (pseudorandom) UUID, see
- * http://tools.ietf.org/html/rfc4122#section-4.4
+ * http://tools.ietf.org/html/rfc9562#section-4.4
*/
The new RFC doesn't have section 4.4.
---
+ * All UUID bytes are filled with strong random numbers except version and
+ * variant 0b10 bits.
I'm concerned that "version and variant 0b10 bits" is not very clear
to readers. I think we can just mention "... except version and
variant bits".
---
+
+#ifndef WIN32
+#include <time.h>
+
+static uint64 get_real_time_ns()
+{
+ struct timespec tmp;
+
+ clock_gettime(CLOCK_REALTIME, &tmp);
+ return tmp.tv_sec * 1000000000L + tmp.tv_nsec;
+}
+#else /* WIN32 */
+
+#include "c.h"
+#include <sysinfoapi.h>
+#include <sys/time.h>
+
+/* FILETIME of Jan 1 1970 00:00:00, the PostgreSQL epoch */
+static const unsigned __int64 epoch = UINT64CONST(116444736000000000);
+
+/*
+ * FILETIME represents the number of 100-nanosecond intervals since
+ * January 1, 1601 (UTC).
+ */
+#define FILETIME_UNITS_TO_NS UINT64CONST(100)
+
+
+/*
+ * timezone information is stored outside the kernel so tzp isn't used anymore.
+ *
+ * Note: this function is not for Win32 high precision timing purposes. See
+ * elapsed_time().
+ */
+static uint64
+get_real_time_ns()
+{
+ FILETIME file_time;
+ ULARGE_INTEGER ularge;
+
+ GetSystemTimePreciseAsFileTime(&file_time);
+ ularge.LowPart = file_time.dwLowDateTime;
+ ularge.HighPart = file_time.dwHighDateTime;
+
+ return (ularge.QuadPart - epoch) * FILETIME_UNITS_TO_NS;
+}
+#endif
I think that it's better to implement these functions in instr_time.h
or another file.
---
+/* minimum amount of ns that guarantees step of increased_clock_precision */
+#define SUB_MILLISECOND_STEP (1000000/4096 + 1)
I think we can rewrite it to:
#define NS_PER_MS INT64CONST(1000000)
#define SUB_MILLISECOND_STEP ((NS_PER_MS / (1 << 12)) + 1)
This improves readability.
Also, I think "#define NS_PER_US INT64CONST(1000)" can also be used in
many places.
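
To make the arithmetic concrete, here is a small sketch, assuming the 12-bit value is computed as floor(sub_ms_ns * 4096 / NS_PER_MS). It uses INT64_C from <stdint.h> in place of INT64CONST so that it compiles standalone, and the helper name sub_ms_fraction() is invented for the example.

```c
#include <stdint.h>

#define NS_PER_MS   INT64_C(1000000)
/* 1000000 / 4096 truncates to 244, so the step is 245 ns */
#define SUB_MILLISECOND_STEP    ((NS_PER_MS / (1 << 12)) + 1)

/* Hypothetical helper: 12-bit fraction of the current millisecond. */
static inline int64_t
sub_ms_fraction(int64_t ns)
{
    return ((ns % NS_PER_MS) * (1 << 12)) / NS_PER_MS;
}
```

Advancing the timestamp by 245 ns adds slightly more than 1.0 to the scaled value, so the 12-bit field always increases by at least one within the same millisecond; 244 ns would not be enough.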
---
+ /* set version field, top four bits are 0, 1, 1, 1 */
+ uuid->data[6] = (uuid->data[6] & 0x0f) | 0x70;
+ /* set variant field, top two bits are 1, 0 */
+ uuid->data[8] = (uuid->data[8] & 0x3f) | 0x80;
I think we can make an inline function to set both variant and version
so we can use it for generating UUIDv4 and UUIDv7.
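
Such a helper might look like the sketch below. The function name and the example_uuid stand-in type (used here instead of pg_uuid_t so the snippet compiles on its own) are assumptions for illustration only.

```c
#include <assert.h>

#define UUID_LEN 16

/* Stand-in for pg_uuid_t, so the sketch compiles standalone. */
typedef struct
{
    unsigned char data[UUID_LEN];
} example_uuid;

/* Hypothetical helper: stamp the version and the 0b10 variant onto a UUID. */
static inline void
uuid_set_version_and_variant(example_uuid *uuid, unsigned char version)
{
    assert(version <= 0x0f);

    /* version occupies the top four bits of byte 6 */
    uuid->data[6] = (unsigned char) ((uuid->data[6] & 0x0f) | (version << 4));
    /* variant occupies the top two bits of byte 8, fixed to 1, 0 */
    uuid->data[8] = (unsigned char) ((uuid->data[8] & 0x3f) | 0x80);
}
```

The UUIDv4 path would call it with 4 and the UUIDv7 path with 7, after the remaining bytes have been filled in.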
---
+ tms = uuid->data[5];
+ tms += ((uint64) uuid->data[4]) << 8;
+ tms += ((uint64) uuid->data[3]) << 16;
+ tms += ((uint64) uuid->data[2]) << 24;
+ tms += ((uint64) uuid->data[1]) << 32;
+ tms += ((uint64) uuid->data[0]) << 40;
How about rewriting these as follows, for consistency with the UUIDv1 code?
tms = uuid->data[5]
    + ((uint64) uuid->data[4] << 8)
    + ((uint64) uuid->data[3] << 16)
    + ((uint64) uuid->data[2] << 24)
    + ((uint64) uuid->data[1] << 32)
    + ((uint64) uuid->data[0] << 40);
---
Thinking about the function structures more, I think we can refactor
generate_uuidv7(), uuidv7() and uuidv7_interval():
- create a function, get_clock_timestamp_ns(), that provides a
  nanosecond-precision timestamp
    - the returned timestamp is guaranteed to be greater than the
      previously returned value.
    - this function can be inlined.
- create a function, generate_uuidv7(), that takes a
  nanosecond-precision timestamp as a function argument and generates
  a UUIDv7 based on it.
    - this function can be inlined too.
- uuidv7() gets the timestamp from get_clock_timestamp_ns() and passes
  it to generate_uuidv7().
- uuidv7_interval() gets the timestamp from get_clock_timestamp_ns(),
  adjusts it based on the given interval, and passes it to
  generate_uuidv7().
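
The structure I have in mind could be sketched roughly as follows. This is only an outline under several assumptions: example_uuid stands in for pg_uuid_t, rand() is a placeholder for a strong random source, the interval is simplified to a plain nanosecond shift, the clock is read with POSIX clock_gettime() only, and the static variable makes it non-thread-safe.

```c
#define _POSIX_C_SOURCE 200809L /* for clock_gettime under strict -std modes */
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

#define NS_PER_S    INT64_C(1000000000)
#define NS_PER_MS   INT64_C(1000000)
#define SUB_MILLISECOND_STEP    ((NS_PER_MS / (1 << 12)) + 1)

/* Stand-in for pg_uuid_t, so the sketch compiles standalone. */
typedef struct
{
    unsigned char data[16];
} example_uuid;

/* Returns a nanosecond timestamp strictly greater than the previous one. */
static inline int64_t
get_clock_timestamp_ns(void)
{
    static int64_t previous_ns = 0;
    struct timespec ts;
    int64_t     ns;

    clock_gettime(CLOCK_REALTIME, &ts);
    ns = (int64_t) ts.tv_sec * NS_PER_S + ts.tv_nsec;

    /* if the clock did not advance enough, step ahead of it */
    if (ns < previous_ns + SUB_MILLISECOND_STEP)
        ns = previous_ns + SUB_MILLISECOND_STEP;
    previous_ns = ns;
    return ns;
}

/* Builds a UUIDv7 from a non-negative nanosecond timestamp. */
static inline example_uuid
generate_uuidv7(int64_t ns)
{
    example_uuid uuid;
    uint64_t    unix_ts_ms = (uint64_t) (ns / NS_PER_MS);
    uint32_t    sub_ms = (uint32_t) (((ns % NS_PER_MS) * (1 << 12)) / NS_PER_MS);
    int         i;

    /* ASSUMPTION: rand() is only a placeholder for a strong random source */
    for (i = 0; i < 16; i++)
        uuid.data[i] = (unsigned char) (rand() & 0xff);

    /* 48-bit big-endian millisecond timestamp */
    uuid.data[0] = (unsigned char) (unix_ts_ms >> 40);
    uuid.data[1] = (unsigned char) (unix_ts_ms >> 32);
    uuid.data[2] = (unsigned char) (unix_ts_ms >> 24);
    uuid.data[3] = (unsigned char) (unix_ts_ms >> 16);
    uuid.data[4] = (unsigned char) (unix_ts_ms >> 8);
    uuid.data[5] = (unsigned char) unix_ts_ms;

    /* version 7 in the top four bits, then the 12-bit fraction */
    uuid.data[6] = (unsigned char) (0x70 | (sub_ms >> 8));
    uuid.data[7] = (unsigned char) (sub_ms & 0xff);
    /* variant: top two bits are 1, 0 */
    uuid.data[8] = (unsigned char) ((uuid.data[8] & 0x3f) | 0x80);
    return uuid;
}

static inline example_uuid
uuidv7(void)
{
    return generate_uuidv7(get_clock_timestamp_ns());
}

/* Simplification: the shift is given in nanoseconds, not as an interval. */
static inline example_uuid
uuidv7_interval(int64_t shift_ns)
{
    return generate_uuidv7(get_clock_timestamp_ns() + shift_ns);
}
```

With this split, the monotonicity logic lives only in get_clock_timestamp_ns(), and generate_uuidv7() stays a pure function of its timestamp argument, apart from the random bytes.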
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com