Re: UUID v7

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: Andrey Borodin <x4mmm(at)yandex-team(dot)ru>
Cc: Daniel Verite <daniel(at)manitou-mail(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>, Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>, Michael Paquier <michael(at)paquier(dot)xyz>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, Pgsql-Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Mat Arye <mat(at)timescaledb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Stepan Neretin <sncfmgg(at)gmail(dot)com>
Subject: Re: UUID v7
Date: 2025-01-31 18:49:48
Message-ID: CAD21AoDfHfF8=h5vauLOmVSxS6dnNGHmqT4h0gSoj1jM-qRjnQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Jan 30, 2025 at 11:09 PM Andrey Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
>
>
> > On 31 Jan 2025, at 00:54, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > I like this idea. Would you like to write a patch, or shall I?
>
> I propose to separate milliseconds from nanoseconds. Please find attached implementation of this.
> With this patch we can generate correct UUIDs in a very distant future.
> postgres=# select x, uuid_extract_timestamp(uuidv7((x::text || ' year'::text)::interval)),
> (x::text || ' year'::text)::interval
> from generate_series(1,9000,1000) x;
> x | uuid_extract_timestamp | interval
> ------+-----------------------------+------------
> 1 | 2026-01-31 12:00:53.084+05 | 1 year
> 1001 | 3026-01-31 12:00:53.084+05 | 1001 years
> 2001 | 4026-01-31 12:00:53.084+05 | 2001 years
> 3001 | 5026-01-31 12:00:53.084+05 | 3001 years
> 4001 | 6026-01-31 12:00:53.084+05 | 4001 years
> 5001 | 7026-01-31 12:00:53.085+05 | 5001 years
> 6001 | 8026-01-31 12:00:53.085+05 | 6001 years
> 7001 | 9026-01-31 12:00:53.085+05 | 7001 years
> 8001 | 10026-01-31 12:00:53.085+05 | 8001 years
> (9 rows)

Thank you for the patch! I agree with the basic direction of this fix.
Here are some review comments:

---
-static inline int64 get_real_time_ns_ascending();
+static inline uint64 get_real_time_ns_ascending();

IIUC we don't need to replace int64 with uint64 if we have two
separate parameters for generate_uuidv7(). It seems to be conventional
to use a signed int for timestamps.

---
Need to update the function comment of generate_uuidv7() as we changed
the function arguments.

---
- ns = (ts + (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) *
SECS_PER_DAY * USECS_PER_SEC)
- * NS_PER_US + ns % NS_PER_US;
+ us = (ts + (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) *
SECS_PER_DAY * USECS_PER_SEC);

/* Generate an UUIDv7 */
- uuid = generate_uuidv7(ns);
+ uuid = generate_uuidv7(us / 1000, (us % 1000) * 1000 + ns % NS_PER_US);

I think we can have an inline function or a marco (or use TMODULO()?)
to split nanoseconds into milliseconds and sub-milliseconds so that
uuidv7() and uuidv7_interval() can pass them to generate_uuidv7().

The comments in uuidv7_interval() also need to be updated accordingly.

---
I think we need to consider how we can handle the timestamp shifting.
UUIDv7 contains 48 bits Unix timestamp at milliseconds precision,
which can represent timestamps approximately between 2493 BC and 6432
AC. If users specify an interval to shift the timestamp beyond the
range, 48-bits timestamp would be wrapped around and they would not be
able to get an expected result. Do we need to raise an error in that
case?

---
Another problem I found in uuid_extract_timestamp() is that it cannot
correctly extract a timestamp before 1970/1/1 stored in a UUIDv7
value:

postgres(1:1795331)=# select year, uuid_extract_timestamp(uuidv7((year
|| 'year ago')::interval)) from generate_series(54, 56) year;
year | uuid_extract_timestamp
------+-----------------------------
54 | 1971-01-31 10:46:25.111-08
55 | 1970-01-31 10:46:25.111-08
56 | 10888-09-01 17:18:15.768-07
(3 rows)

The problem is that we correctly store a negative timestamp value in a
UUIDv7 value but uuid_extract_timestamp() unconditionally treats it as
a positive timestamp value. I think this is a separate bug we need to
fix.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

In response to

  • Re: UUID v7 at 2025-01-31 07:08:45 from Andrey Borodin

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2025-01-31 19:00:49 Re: Increase of maintenance_work_mem limit in 64-bit Windows
Previous Message James Hunter 2025-01-31 17:22:50 Re: RFC: Packing the buffer lookup table