Re: UUID v7

From: Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>
To: Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Mat Arye <mat(at)timescaledb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>
Subject: Re: UUID v7
Date: 2024-07-23 23:09:48
Message-ID: 1012137874.340418.1721776188406@mail.yahoo.com
Lists: pgsql-hackers


Dear Colleagues,

Although the uuidv7(timestamp) function clearly contradicts RFC 9562, the uuidv7(timestamp_offset) function is fully compliant with RFC 9562 and is absolutely necessary.
Here is a quote from RFC 9562 to support this statement (RFC 9562: Universally Unique IDentifiers (UUIDs)):

"Altering, Fuzzing, or Smearing:

Implementations MAY alter the actual timestamp. Some examples include security considerations around providing a real-clock value within a UUID to 1) correct inaccurate clocks, 2) handle leap seconds, or 3) obtain a millisecond value by dividing by 1024 (or some other value) for performance reasons (instead of dividing a number of microseconds by 1000). This specification makes no requirement or guarantee about how close the clock value needs to be to the actual time."

It’s written clumsily, of course, but the intention of the authors of RFC 9562 is completely clear: the current timestamp can be changed by any amount and for any reason, including security or performance reasons. The wording provides only a few examples, and the list is certainly not exhaustive.

The motives of the authors of RFC 9562 are also clear. The timestamp is needed only to generate monotonically increasing UUIDv7 values. The timestamp should not be used as a source of data about the time the record was created (this is explicitly stated in section 6.12, Opacity). Therefore, the actual timestamp can and should be changed if necessary.

Why then does RFC 9562 contain wording about the need to use a "Unix Epoch timestamp"? First, the authors of RFC 9562 wanted to get away from using the Gregorian calendar, which required a timestamp that was too long. Second, RFC 9562 prohibits inserting into UUIDv7 a completely arbitrary date and time value that does not increase with the passage of real time. And this is correct, since in this case the generated UUIDv7 would not be monotonically increasing. Third, on almost all computing platforms there is a convenient source of a "Unix Epoch timestamp".

Why does the uuidv7() function need the optional formal parameter timestamp_offset? This question is best answered by a quote from https://lu.sagebl.eu/notes/maybe-we-dont-need-uuidv7 :

"Leaking information

UUIDv4 does not leak information assuming a proper implementation. But, UUIDv7 in fact does: the timestamp of the server is embedded into the ID. From a business point of view it discloses information about resource creation time. It may not be a problem depending on the context. Current RFC draft allows implementation to tweak timestamps a little to enforce a strict increasing order between two generations and to alleviate some security concerns."

There is a lot of hate on the internet about "UUIDv7 should not be used because it discloses the date and time the record was created." If there were a ban on changing the actual timestamp, this would prevent the use of UUIDv7 in mission-critical databases, and would generally lead to a decrease in the popularity of UUIDv7.

The implementation details of timestamp_offset are, of course, up to the developer. But I would suggest two features:

1. If the result of applying timestamp_offset to the timestamp goes beyond the permissible interval, the timestamp_offset value must be reset to zero.
2. The data type for timestamp_offset should be the developer-friendly interval type (https://postgrespro.ru/docs/postgresql/16/datatype-datetime?lang=en#DATATYPE-INTERVAL-INPUT), which allows you to enter the argument value using the words microsecond, millisecond, second, minute, hour, day, week, month, year, decade, century, millennium.
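
The two suggestions above could be sketched roughly as follows. This is only an illustration in Python, not the proposed PostgreSQL implementation: the names uuidv7, shifted_unix_ms and the now_ms parameter are hypothetical, Python's timedelta stands in for the interval type (it has no month or year units), and the sketch makes no attempt to guarantee monotonicity within a single millisecond.

```python
import os
import time
import uuid
from datetime import timedelta
from typing import Optional

# Largest value that fits in the 48-bit unix_ts_ms field of UUIDv7.
MAX_UNIX_MS = (1 << 48) - 1


def shifted_unix_ms(now_ms: int, timestamp_offset: timedelta) -> int:
    """Apply the offset; if the result leaves the 48-bit range, use a zero offset."""
    offset_ms = timestamp_offset // timedelta(milliseconds=1)
    shifted = now_ms + offset_ms
    if not 0 <= shifted <= MAX_UNIX_MS:
        return now_ms  # suggestion 1: the offset is reset to zero
    return shifted


def uuidv7(timestamp_offset: timedelta = timedelta(0),
           now_ms: Optional[int] = None) -> uuid.UUID:
    """Build a UUIDv7 whose timestamp is the current time plus timestamp_offset.

    now_ms is a hypothetical hook for testing; by default the system clock is used.
    """
    if now_ms is None:
        now_ms = time.time_ns() // 1_000_000
    ts = shifted_unix_ms(now_ms, timestamp_offset)

    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 are used
    rand_a = (rand >> 64) & 0xFFF                 # 12 bits
    rand_b = rand & ((1 << 62) - 1)               # 62 bits

    # Layout: 48-bit timestamp | version 7 | rand_a | variant 0b10 | rand_b
    value = (ts << 80) | (0x7 << 76) | (rand_a << 64) | (0b10 << 62) | rand_b
    return uuid.UUID(int=value)
```

For example, uuidv7(timedelta(days=-365)) would produce an identifier whose embedded timestamp lies one year in the past, while an offset large enough to push the timestamp below zero or above the 48-bit limit would simply be ignored.
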

I really hope that timestamp_offset will be used in the uuidv7() function for PostgreSQL.

Sergey Prokhorenko
sergeyprokhorenko(at)yahoo(dot)com(dot)au


In response to

  • Re: UUID v7 at 2024-07-20 11:46:23 from Andrey M. Borodin
