From: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru> |
---|---|
To: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, "Kyzer Davis (kydavis)" <kydavis(at)cisco(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Andrey Borodin <amborodin86(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, "brad(at)peabody(dot)io" <brad(at)peabody(dot)io>, "wolakk(at)gmail(dot)com" <wolakk(at)gmail(dot)com> |
Subject: | Re: UUID v7 |
Date: | 2023-07-07 12:06:19 |
Message-ID: | D8F6A3ED-9F70-4FBE-A259-AABFD89083C4@yandex-team.ru |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
> On 6 Jul 2023, at 21:38, Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com> wrote:
>
> I think it would be reasonable to review this patch now.
+1.
Also, I think we should discuss UUID v8. UUID version 8 provides an RFC-compatible format for experimental or vendor-specific use cases. Revision 1 of IETF draft contained interesting code for v8: almost similar to v7, but with fields for "node ID" and "rolling sequence number".
I think this is reasonable approach, thus I attach implementation of UUID v8 per [0]. But from my point of view this implementation has some flaws.
These two new fields "node ID" and "sequence" are there not for uniqueness, but rather for data locality.
But they are placed at the end, in bytes 14 and 15, after randomly generated numbers.
I think that "sequence" is there to help generate local ascending identifiers when the real time clock do not provide enough resolution. So "sequence" field must be placed after 6 bytes of time-generated identifier.
On a contrary "node ID" must differentiate identifiers generated on different nodes. So it makes sense to place "node ID" before timing. So identifiers generated on different nodes will tend to be in different ranges.
Although, section "6.4. Distributed UUID Generation" states that "node ID" is there to decrease the likelihood of a collision. So my intuition might be wrong here.
Do we want to provide this "vendor-specific" UUID with tweaks for databases? Or should we limit the scope with well defined UUID v7?
Best regards, Andrey Borodin.
[0] https://datatracker.ietf.org/doc/html/draft-ietf-uuidrev-rfc4122bis-01
Attachment | Content-Type | Size |
---|---|---|
v2-0001-Implement-UUID-v7-and-v8-as-per-IETF-draft.patch | application/octet-stream | 7.4 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Daniel Gustafsson | 2023-07-07 12:09:08 | Re: DROP DATABASE is interruptible |
Previous Message | Amit Langote | 2023-07-07 11:59:32 | Re: remaining sql/json patches |