Re: Using varchar primary keys.

From: Joe Van Dyk <joe(at)tanga(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Tim Uckun <timuckun(at)gmail(dot)com>, jesusthefrog <jesusthefrog(at)gmail(dot)com>, Jasen Betts <jasen(at)xnet(dot)co(dot)nz>, pgsql-general <pgsql-general(at)postgresql(dot)org>
Subject: Re: Using varchar primary keys.
Date: 2013-04-02 20:53:56
Message-ID: CACfv+p+OBgT1DQ1CTP82e-jsBBSZy8XWME0k5NHwcAoog7_pgg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, Apr 2, 2013 at 11:16 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Tue, Apr 2, 2013 at 10:34 AM, Joe Van Dyk <joe(at)tanga(dot)com> wrote:
> > On Mon, Apr 1, 2013 at 1:11 PM, Tim Uckun <timuckun(at)gmail(dot)com> wrote:
> >>
> >>
> >>
> >>
> >> On Tue, Apr 2, 2013 at 8:35 AM, jesusthefrog <jesusthefrog(at)gmail(dot)com>
> >> wrote:
> >>>
> >>> On the topic of 'natural' versus 'synthetic' primary keys, I am
> generally
> >>> in the camp that an extra ID field won't cost you too much, and while
> one
> >>> may not need it for a simple table (i.e. id, name) one might add any
> number
> >>> of columns later, and you'll be glad to have it.
> >>>
> >>
> >> Nothing prevents you from adding more columns if you use varchar primary
> >> keys.
> >>
> >>>
> >>>
> >>> My preferred method is to give every table an ID column of UUID type
> and
> >>> generate a UUID using the uuid-ossp contrib module. This also prevents
> >>> someone not familiar with the database design from using an ID
> somewhere
> >>> they should not (as is possible with natural PKs) or treating the ID
> as an
> >>> integer, not an identifier (as is all too common with serial integers).
> >>>
> >>>
> >>
> >> This would be a concern if you had multi master writes . As far as I
> know
> >> Postgres does not have a true multi master replication system so all the
> >> writes have to happen on one server right?
> >>
> >> As for UUIDs I use them sometimes but I tend to also use one serial
> column
> >> because when I am communicating with people it makes it so much easier
> to
> >> say "dealer number X" than "dealer number SOME_HUGE_UUID". I often
> have to
> >> talk to people about the data and UUIDs make it very difficult to
> >> communicate with humans.
> >
> >
> > I've been wishing for a smaller uuid type for a while. Say you need to
> > assign a Order #. Customers might be referencing the number, so you don't
> > want it to be too long. But you don't want Order #'s to be guessable or
> have
> > the characters/digits be transposed accidently.
> >
> > I've been using a unique text column with a default of
> random_characters(12)
> >
> > CREATE OR REPLACE FUNCTION public.random_characters(length integer)
> > RETURNS text
> > LANGUAGE sql
> > STABLE
> > AS $function$
> > SELECT array_to_string(array((
> > SELECT SUBSTRING('abcdefghjkmnpqrstuvwxyz23456789'
> > FROM mod((random()*31)::int, 31)+1 FOR 1)
> > FROM generate_series(1, $1))),'');
> > $function$;
> >
> > This seems to work ok. I don't allow 1's or l's or 0's or O's, as they
> can
> > easily be mistyped or misread.
>
> This is pseudo random and can be guessed, which is maybe dangerous
> depending on circumstance. For stronger random stream go to
> pgcrypto.gen_random_bytes(). Also, now you have to worry about
> collisions -- the whole point of uuid is to try and keep you from
> having to deal with that.
>

Right, but it's better than using serial's as far as being guessable.

The probability for collisions are fairly low, if you are using 12 or more
characters (with 30 possible characters). Not sure what the math is on the
probability of collisions (birthday problem) though.. and you could have a
trigger that checked for the existence of a matching key before
inserts/updates.

And using UUIDs would be too long for lots of purposes where people are
working with the numbers, and where there might be external constraints on
how long the IDs can be.

An example use case:
https://www.tanga.com/deals/cd8f90c81a/oral-b-sensitive-clean-6-extra-soft-replacement-brush-heads

where "cd8f90c81a" is the 'uuid' for that product. It's non-guessable
enough, and I don't want to put a full UUID in the URL.

>
> My historical comments in this debate are noted. To summarize, I
> strongly believe that natural keys are often (but not always) better.
>
>

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Kevin Grittner 2013-04-02 21:17:23 Re: in C trigger function find out if column is part of primary key
Previous Message Jerry Sievers 2013-04-02 19:45:10 Re: create temporary table problem