From: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Daniel Verite <daniel(at)manitou-mail(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>, Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>, Michael Paquier <michael(at)paquier(dot)xyz>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, Pgsql-Hackers Mailing List <pgsql-hackers(at)postgresql(dot)org>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Mat Arye <mat(at)timescaledb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Stepan Neretin <sncfmgg(at)gmail(dot)com> |
Subject: | Re: UUID v7 |
Date: | 2025-02-09 17:07:17 |
Message-ID: | 11C1A15A-73D9-4690-BC88-CE17A607C2FE@yandex-team.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
I've took into account note from Sergey that "offset" is better name for uuidv7() argument than "shift".
> On 5 Feb 2025, at 03:02, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
>>
>> I was thinking about incorporating test like this.
>>
>>>> With this patch we can generate correct UUIDs in a very distant future.
>>>> postgres=# select x, uuid_extract_timestamp(uuidv7((x::text || ' year'::text)::interval)),
>>>> (x::text || ' year'::text)::interval
>>>> from generate_series(1,9000,1000) x;
>>>> x | uuid_extract_timestamp | interval
>>>> ------+-----------------------------+------------
>>>> 1 | 2026-01-31 12:00:53.084+05 | 1 year
>>>> 1001 | 3026-01-31 12:00:53.084+05 | 1001 years
>>>> 2001 | 4026-01-31 12:00:53.084+05 | 2001 years
>>>> 3001 | 5026-01-31 12:00:53.084+05 | 3001 years
>>>> 4001 | 6026-01-31 12:00:53.084+05 | 4001 years
>>>> 5001 | 7026-01-31 12:00:53.085+05 | 5001 years
>>>> 6001 | 8026-01-31 12:00:53.085+05 | 6001 years
>>>> 7001 | 9026-01-31 12:00:53.085+05 | 7001 years
>>>> 8001 | 10026-01-31 12:00:53.085+05 | 8001 years
>>>> (9 rows)
>>
>
> Something like following queries might be workable for example?
>
> create table test (c serial, d uuid, t timestamptz generated always as
> (uuid_extract_timestamp(d)) stored);
> insert into test (d) select uuidv7((n || 'years')::interval) from
> generate_series(1, 2000) n;
> select count(*) from (select t - lag(t) over (order by c) as diff from
> test) where diff > '10 year' ;
Yeah, makes sense. I reduced tolerance to 366+1 day. Must be stable if we've done all the time offset business right.
> Here are some review comments:
>
> #define NS_PER_S INT64CONST(1000000000)
> #define NS_PER_MS INT64CONST(1000000)
> +#define US_PER_MS INT64CONST(1000)
> #define NS_PER_US INT64CONST(1000)
>
> I think it's clear if we put US_PER_MS below NS_PER_US.
OK.
> ---
> *
> - * ns is a number of nanoseconds since start of the UNIX epoch. This value is
> + * unix_ts_ms is a number of milliseconds since start of the UNIX epoch,
> + * ns_in_ms is a number of nanoseconds within millisecond. These values are
> * used for time-dependent bits of UUID.
>
> I think we can mention that the RFC describes that stored unix
> timestamp as an unsigned integer.
Done. Feel free to adjust my wordings, I've no sense of idiomatic English.
>
> ---
> static pg_uuid_t *
> -generate_uuidv7(int64 ns)
> +generate_uuidv7(uint64 unix_ts_ms, uint32 ns_in_ms)
>
> How about renaming ns_in_ms with sub_ms?
OK.
>
> ---
> + /* 64 bits is enough for real time, but not for a time range of UUID */
>
> I could not understand the point of this comment. It seems to say that
> 64-bits is not enough for a time range of UUID, but doesn't the time
> range of UUIDv7 use only 48 bits? It seems to need more comments.
I've tried to say that acquiring current time as an int64 ns since UNIX epoch is still viable for the code (until year 2262).
> ---
> - ns = (ts + (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) *
> SECS_PER_DAY * USECS_PER_SEC)
> - * NS_PER_US + ns % NS_PER_US;
> + us = (ts + (POSTGRES_EPOCH_JDATE - UNIX_EPOCH_JDATE) *
> SECS_PER_DAY * USECS_PER_SEC);
>
> /* Generate an UUIDv7 */
> - uuid = generate_uuidv7(ns);
> + uuid = generate_uuidv7(us / US_PER_MS, (us % US_PER_MS) *
> NS_PER_US + ns % NS_PER_US);
>
> Need to update comments in uuidv7_internval() such as:
>
> /*
> * Shift the current timestamp by the given interval. To calculate time
> * shift correctly, we convert the UNIX epoch to TimestampTz and use
> * timestamptz_pl_interval(). Since this calculation is done with
> * microsecond precision, we carry nanoseconds from original ns value to
> * shifted ns value.
> */
>
> and
>
> /*
> * Convert a TimestampTz value back to an UNIX epoch and back nanoseconds.
> */
I've tried. I'm not very satisfied with comments, but could not come up with easier description.
Thanks!
Best regards, Andrey Borodin.
Attachment | Content-Type | Size |
---|---|---|
v3-0001-UUDv7-fix-offset-computations-in-dates-after-2262.patch | application/octet-stream | 8.6 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2025-02-09 17:59:27 | Re: RFC: Allow EXPLAIN to Output Page Fault Information |
Previous Message | Alexander Korotkov | 2025-02-09 15:38:03 | Re: Replace IN VALUES with ANY in WHERE clauses during optimization |