Re: Avoiding surrogate keys

From: Thom Brown <thombrown(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: PGSQL Mailing List <pgsql-general(at)postgresql(dot)org>
Subject: Re: Avoiding surrogate keys
Date: 2010-04-21 19:30:44
Message-ID: z2gbddc86151004211230za92d0d4bjc877f202e3cd5ee0@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

On 21 April 2010 20:18, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Wed, Apr 21, 2010 at 3:01 PM, Thom Brown <thombrown(at)gmail(dot)com> wrote:
> > I think I know what I plan to do, but want to throw this out there to see
> if
> > there are differing points of view.
> > I have a mailing list table, and 2 of the columns contain values which
> have
> > to be from a list. These are country and status. There are 237 possible
> > countries and 3 possible statuses. Now I know some people would assign a
> > sequence ID (surrogate key) to the country and status values, and have
> them
> > looked up in separate tables to get the textual value, but I think I'll
> > still have those tables, just without an ID column, so 1 column for both
> the
> > countries and statuses tables. This means storing the proper value in
> the
> > main table.
> > So instead of
> > name, email, country, status
> > 'mr smith', 'emailaddress(at)example(dot)com', 44, 2
> > 'mrs jones', 'me(at)emailcompany(dot)com', 21, 1
> > I'd have
> > name, email, country, status
> > 'mr smith', 'emailaddress(at)example(dot)com', 'China', 'Registered'
> > 'mrs jones', 'me(at)emailcompany(dot)com', 'Belgium', 'Unconfirmed'
> > The values of course would be constrained by foreign key lookup to their
> > associated tables.
> > Are there any serious downsides to this? If so, what would you
> recommend?
>
> Natural keys:
> *) force formal relationships into your key design (this is good)
> *) Make your database MUCH easier to follow, browse, and understand
> *) in particular cases allow you to skip joins
> *) will make your indexes fatter (this is not good)
> *) can be a pain if your keys are updated frequently
> *) can be a major pain if your key changes in structure (adds a field,
> or changes in type)
>
> Surrogate keys:
> *) Give you faster joins, but more of them (this is a win/loss
> depending on circumstances)
> *) Tend to encourage lazy/poor designs, since you hide relationships
> behind a value
> *) Make the tables more difficult to browse and understand
> *) Make updates to keys/key structure trivial
>
> I personally use natural keys when I can and surrogates when I have
> to. When I do use a surrogate, I tend to still define the natural key
> as primary and simply make a alternate 'unique' constraint for the
> surrogate.
>
> merlin
>

Thanks for the comments guys. I'm now pretty sure using natural keys is the
right approach. Neither list will ever change type, and if they change,
it'll be quite infrequent. Index size shouldn't really be a problem since
we're realistically talking thousands of rows.

What I hate about surrogate keys is having to keep looking stuff up:

INSERT INTO stuff (col_a, col_b)
SELECT 'my_value', table_b.id
FROM table_b
WHERE table_b.real_value = 'PostgreSQL';

As opposed to:

INSERT INTO stuff (col_a, col_b) VALUES ('my_value', 'PostgreSQL');

Just making sure I don't fall into the surrogate key abuse trap. :)

Thom

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Bill Moran 2010-04-21 19:32:26 Re: Avoiding surrogate keys
Previous Message Joshua D. Drake 2010-04-21 19:25:16 Re: Avoiding surrogate keys