From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | "Andrey M(dot) Borodin" <x4mmm(at)yandex-team(dot)ru> |
Cc: | Sergey Prokhorenko <sergeyprokhorenko(at)yahoo(dot)com(dot)au>, Jelte Fennema-Nio <postgres(at)jeltef(dot)nl>, Michael Paquier <michael(at)paquier(dot)xyz>, Aleksander Alekseev <aleksander(at)timescale(dot)com>, pgsql-hackers mailing list <pgsql-hackers(at)postgresql(dot)org>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Przemysław Sztoch <przemyslaw(at)sztoch(dot)pl>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>, Mat Arye <mat(at)timescaledb(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Nikolay Samokhvalov <samokhvalov(at)gmail(dot)com>, Junwang Zhao <zhjwpku(at)gmail(dot)com>, Stepan Neretin <sncfmgg(at)gmail(dot)com> |
Subject: | Re: UUID v7 |
Date: | 2024-11-07 07:42:05 |
Message-ID: | CAD21AoBVxi5hZJnoyN-PKhd5UxCFtooRiDpsTDs3Eg=rROgsBA@mail.gmail.com |
Lists: | pgsql-hackers |
On Wed, Nov 6, 2024 at 10:14 AM Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
>
>
>
> > On 5 Nov 2024, at 23:56, Andrey M. Borodin <x4mmm(at)yandex-team(dot)ru> wrote:
> >
> > <v30-0001-Implement-UUID-v7.patch>
>
> Some more thoughts on this patch version:
>
> 0. Comment mentioning nanoseconds, while we do not need to carry anything
> /* Convert TimestampTz back and carry nanoseconds. */
>
> 1. There's unnecessary &3 in
> uuid->data[7] = uuid->data[7] | ((uuid->data[8] >> 6) & 3);
>
> 2. Currently we store 0..999 microseconds in 10 bits, so values 1000..1023 are unused. We could use them for overflow. That would slightly increase non-overflowing capacity when generating more than million UUIDs per second on one backend. However, given current performance of our CSPRNG I do not think this feature worth code complexity.
>
Using only 10 bits for microseconds keeps the implementation simple,
but I'm not sure 10 bits is enough to generate UUIDs at microsecond
granularity without losing monotonicity. Since the 10-bit microsecond
value is stored as is in the rand_a space, at most 1000 UUIDs can be
generated per millisecond without losing monotonicity.

For example, in my environment, it took 1808 milliseconds to generate
1 million UUIDs. This is about 553 UUIDs generated per millisecond. As
UUID generation performance improves, I think 10 bits will not be
enough.

=# select count(uuidv7()) from generate_series(1, 1_000_000);
  count
---------
 1000000
(1 row)

Time: 1808.734 ms

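As a rough check of the arithmetic behind that measurement, the sketch below divides the UUID count by the elapsed time and compares the result with the 1000 distinct microsecond values available per millisecond. The function and constant names are made up for illustration; only the two measured numbers come from the run above. The division works out to a little over 550 UUIDs per millisecond, so this machine is already using more than half of the available slots.

```python
# Back-of-the-envelope check; names here are illustrative, not from the patch.

def uuids_per_ms(total_uuids: int, elapsed_ms: float) -> float:
    """Average number of UUIDs generated per millisecond."""
    return total_uuids / elapsed_ms

# Assumption: the sub-millisecond field stores microseconds 0..999 as is,
# so one millisecond offers at most 1000 distinct, ordered values.
SLOTS_PER_MS = 1000

rate = uuids_per_ms(1_000_000, 1808.734)   # figures from the psql run above
headroom = SLOTS_PER_MS / rate             # speedup that would exhaust the slots

print(f"{rate:.1f} UUIDs/ms, headroom {headroom:.2f}x")
```

In other words, a generator less than twice as fast as this one would produce more UUIDs per millisecond than the microsecond field can distinguish.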
I found a similar comment from Sergey Prokhorenko[1]. He also mentioned:
> 4) Microsecond timestamp fraction subtracts 10 bits from random data, which increases the risk of collision. In the counter, almost all bits are initialized with a random number, which reduces the risk of collision.
I feel that it's better to switch to Method 1 or 2 with a counter
space of 12 bits or larger.
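To make the suggestion concrete, here is a minimal Python sketch of a counter-based layout, assuming "Method 1" means the fixed bit-length dedicated counter described in RFC 9562, Section 6.2. It is not the patch's implementation: the class name, the injectable clock, the choice to seed the counter with 11 random bits (leaving the top bit clear as a rollover guard), and the handling of counter overflow by advancing the stored timestamp are all assumptions made for illustration.

```python
import os
import time
import uuid


class UUIDv7Counter:
    """Hypothetical generator: 48-bit ms timestamp, 12-bit counter in rand_a."""

    def __init__(self, clock_ms=None):
        # clock_ms is injectable only so the sketch can be tested deterministically.
        self._clock_ms = clock_ms or (lambda: time.time_ns() // 1_000_000)
        self._last_ms = -1
        self._counter = 0

    @staticmethod
    def _seed() -> int:
        # Assumption: seed with 11 random bits so the top counter bit starts at 0,
        # leaving at least 2048 increments before the counter overflows.
        return int.from_bytes(os.urandom(2), "big") & 0x7FF

    def next(self) -> uuid.UUID:
        now_ms = self._clock_ms()
        if now_ms > self._last_ms:
            # New millisecond: take the clock value and re-seed the counter.
            self._last_ms = now_ms
            self._counter = self._seed()
        else:
            # Same millisecond (or clock went backwards): bump the counter.
            self._counter += 1
            if self._counter > 0xFFF:
                # Counter exhausted: run the timestamp ahead of the clock.
                self._last_ms += 1
                self._counter = self._seed()

        rand_b = int.from_bytes(os.urandom(8), "big") & ((1 << 62) - 1)
        value = (
            ((self._last_ms & ((1 << 48) - 1)) << 80)  # unix_ts_ms
            | (0x7 << 76)                              # version 7
            | (self._counter << 64)                    # rand_a used as counter
            | (0b10 << 62)                             # RFC variant bits
            | rand_b                                   # 62 random bits
        )
        return uuid.UUID(int=value)


# Usage: even with a frozen clock, the output stays strictly increasing.
gen = UUIDv7Counter(clock_ms=lambda: 1_700_000_000_000)
ids = [gen.next() for _ in range(10_000)]
```

With this layout the ordering within a millisecond comes from the counter rather than from the clock's resolution, and all 62 bits of rand_b remain random.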
Regards,
[1] https://www.postgresql.org/message-id/305478845.5279532.1712440778735%40mail.yahoo.com
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com