From: | Merlin Moncure <mmoncure(at)gmail(dot)com> |
---|---|
To: | Andy Colson <andy(at)squeakycode(dot)net> |
Cc: | David Salisbury <salisbury(at)globe(dot)gov>, pgsql-general(at)postgresql(dot)org |
Subject: | Re: Strategy for Primary Key Generation When Populating Table |
Date: | 2012-02-09 22:30:40 |
Message-ID: | CAHyXU0xCdYNJ0LMm0wpTsJ3ni+4UxJ-n+82UPCCXq8MAGC93qA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
On Thu, Feb 9, 2012 at 4:20 PM, Andy Colson <andy(at)squeakycode(dot)net> wrote:
> On 2/9/2012 4:10 PM, David Salisbury wrote:
>>
>>
>>
>> On 2/9/12 10:08 AM, Rich Shepard wrote:
>>>
>>> I have reports containing macroinvertebrate collection data for several
>>> hundred (or several thousand) of taxa. There is no natural key since
>>> there
>>> are multiple rows for each site/date pair. Years ago Joe Celko taught
>>> me to
>>> seek natural keys whenever they might exist. They don't here. That's
>>> why I
>>> specifically mentioned that in my message.
>>
>>
>>
>> Interesting. I used to think natural keys were okay, but have since
>> decided
>> that surrogates are the way to go. That second layer of abstraction allows
>> for much easier data modifications when needed. What would be an example
>> of a natural key that would be good to use, and why would it be
>> preferable??
>>
>> I'd think the key value must never change, and even say kingdom values in
>> a
>> taxa table could possibly change.. might discover something new and do a
>> little reordering. :) Also natural keys might be strings, which I'm
>> thinking
>> would not be as efficient as integers for an index.
>>
>> -ds
>>
>
>
> Yeah, this is a Vim vs Emacs war. (Vim, :-) )
>
> I prefer surrogates like you. Its way to easy to pick something that one
> day has to change.
>
> Within the last year I remember a long thread about this same thing.
Sure, you can use surrogates, but you should still define or at least
be aware of a natural key if there is one. If you can't (which happens
on various type of data), then the surrogate is giving the illusion of
row uniqueness when there isn't one. This is really a design error:
other keys could depend on this table's primary key which is a
provably ambiguous relationship. Since your rows are not
informationally distinct from each other, why do you need to be able
to point at a specific one?
natural/surrogate is a performance/usability debate with various
tradeoffs. but using surrogate to 'create' uniqueness is a logical
design error; maybe a very forgivable one for various reasons, but the
point stands.
merlin
From | Date | Subject | |
---|---|---|---|
Next Message | Andy Colson | 2012-02-09 22:38:18 | Re: Strategy for Primary Key Generation When Populating Table |
Previous Message | Andy Colson | 2012-02-09 22:20:19 | Re: Strategy for Primary Key Generation When Populating Table |