Re: Using PK value as a String

From: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
To: Bill Moran <wmoran(at)collaborativefusion(dot)com>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, Mario Weilguni <mweilguni(at)sime(dot)com>, valiouk(at)yahoo(dot)co(dot)uk, Jay <arrival123(at)gmail(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Using PK value as a String
Date: 2008-08-12 13:21:40
Message-ID: 48A18E64.10707@mark.mielke.cc
Lists: pgsql-performance

Bill Moran wrote:
>> The main reason to use UUID instead of sequences is if you want to be able to
>> generate unique values across multiple systems. So, for example, if you want
>> to be able to send these userids to another system which is taking
>> registrations from lots of places. Of course that only works if that other
>> system is already using UUIDs and you're all using good generators.
>>
>
> Note that in many circumstances, there are other options than UUIDs. If
> you have control over all the systems generating values, you can prefix
> each generated value with a system ID (i.e. make the high 8 bits the
> system ID and the remaining bits come from a sequence). This allows
> you to still use int4 or int8.
>
> UUID is designed to be a universal solution. But universal solutions
> are frequently less efficient than custom-tailored solutions.
>

Other benefits include:
- Reduced management cost. As described above, one would have to
allocate keyspace in each system. By using a UUID, one can skip this step.
- Increased keyspace. Even if keyspace allocation is performed, an
int4 only has 32 bits of keyspace to allocate. As an example of how
this can run out, the IPv4 address space is already over 85% allocated.
128 bits has a LOT more keyspace than 32 bits or 64 bits.
- Reduced sequence predictability. Certain forms of exploit that
rely on the surrogate key being exposed to the public are rendered
ineffective, as guessing the "next" or "previous" generated key is far
more difficult.
- Use as the key into a cache or other lookup table. Multiple types
of records can be cached in the same storage, as the key is intended
to be universally unique.
- Flexibility to merge systems later, even if unplanned. For
example, System A and System B are run independently for some time.
Then, it is determined that they should be merged. If unique keys are
specific to the system, this becomes far more difficult to implement
than if the unique keys are universal.
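
As a rough illustration of the prefixing scheme quoted above, and of the
merge scenario in the last point, here is a minimal Python sketch. The
names prefixed_key and split_key, and the split into an 8-bit system ID
plus a 56-bit sequence, are assumptions for illustration only. Note that
int8 is signed, so a real schema would have to limit the system ID to 7
bits or map the upper half onto negative values.

```python
import uuid

SYSTEM_ID_BITS = 8                    # assumed: high 8 bits identify the system
SEQUENCE_BITS = 64 - SYSTEM_ID_BITS   # remaining 56 bits come from a local sequence


def prefixed_key(system_id: int, seq: int) -> int:
    """Pack a system ID and a local sequence value into one 64-bit key."""
    if not 0 <= system_id < (1 << SYSTEM_ID_BITS):
        raise ValueError("system_id does not fit in the allocated prefix")
    if not 0 <= seq < (1 << SEQUENCE_BITS):
        raise ValueError("sequence value does not fit in the remaining bits")
    return (system_id << SEQUENCE_BITS) | seq


def split_key(key: int) -> tuple[int, int]:
    """Recover (system_id, seq) from a key built by prefixed_key."""
    return key >> SEQUENCE_BITS, key & ((1 << SEQUENCE_BITS) - 1)


# Two systems running plain, independent sequences produce the same keys,
# which is what makes an unplanned merge painful.
plain_a = [1, 2, 3]
plain_b = [1, 2, 3]
plain_collisions = set(plain_a) & set(plain_b)

# With an allocated prefix per system, the same sequence values stay distinct.
keys_a = [prefixed_key(1, n) for n in plain_a]
keys_b = [prefixed_key(2, n) for n in plain_b]
prefixed_collisions = set(keys_a) & set(keys_b)

# A UUID needs no allocation step at all; each system just generates one.
uuid_a = uuid.uuid4()
uuid_b = uuid.uuid4()
```

The prefix has to be assigned to each system by hand, which is the
management cost mentioned in the first point; the UUID lines at the end
need no such coordination.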

That said, most uses of UUID do not require any of the above. It's a
"just in case" measure that pays the performance cost up front, "just
in case."
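
One easily checked part of that cost is key width. A small sketch using
Python's standard uuid module (the variable names are mine, and this
only measures size, not index or join behaviour):

```python
import uuid

key = uuid.uuid4()

binary_width = len(key.bytes)   # raw 128-bit value, in bytes
text_width = len(str(key))      # canonical hyphenated text form, in characters
int4_width, int8_width = 4, 8   # widths of the integer alternatives, in bytes

print(f"int4={int4_width} int8={int8_width} "
      f"uuid(binary)={binary_width} uuid(text)={text_width}")
```

Keeping the value as a string makes the key wider again than keeping it
in its 16-byte binary form.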

Cheers,
mark

--
Mark Mielke <mark(at)mielke(dot)cc>
