Re: Using PK value as a String

From: "Jay D(dot) Kang" <arrival123(at)gmail(dot)com>
To: "Mario Weilguni" <mweilguni(at)sime(dot)com>
Cc: valiouk(at)yahoo(dot)co(dot)uk, "Gregory Stark" <stark(at)enterprisedb(dot)com>, pgsql-performance(at)postgresql(dot)org
Subject: Re: Using PK value as a String
Date: 2008-08-12 10:18:26
Message-ID: d251ee4a0808120318k1fe2cc7fjf926dd99624248bf@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

You guys totally rock!

I guess, bottom line, we should take that extra day to convert our PK and FK
to a numerical value, using BIG INT to be on the save side. (Even though
Wikipedia's UserID uses just an integer as data type)

To Gregory: Thank you for you valuable statement.
"But the real question here is what's the better design. If you use Username
you'll be cursing if you ever want to provide a facility to allow people to
change their usernames. You may not want such a facility now but one day" I
think you hit the nail on the head with this comment. If a member really
wants to change their username, IE: Choose to go with IloveUSara, only to be
dumped on the alter, who am I to say no.

To Valentin: I wish someone would prove us both wrong or right. I still
thought it wasn't a bad idea to use username a varchar(256) to interact with
all the modules... Well thats what I thought when I first started writing
the tables...

To Jay: Thanks for keeping it short and simple. "I'd like sequence, which is
unique and just use 8 bytes(bigint) When it querying other tables, it will
faster , and disk space smaller than UUID(40 bytes)." I'm taking your advice
on this^^ Although wikipedia's postgresql database schema still stands.

To Craig: Yes, I agree. Please see my comment on IloveUSara.

To Mario: Let's go! I'm Mario... Sorry, I love Mario Kart. Especially on the
old super famacon. Going with int8, thank you for the advice.

On Tue, Aug 12, 2008 at 6:58 PM, Mario Weilguni <mweilguni(at)sime(dot)com> wrote:

> Valentin Bogdanov schrieb:
>
> --- On Mon, 11/8/08, Gregory Stark <stark(at)enterprisedb(dot)com> wrote:
>>
>>
>>
>>> From: Gregory Stark <stark(at)enterprisedb(dot)com>
>>> Subject: Re: [PERFORM] Using PK value as a String
>>> To: "Jay" <arrival123(at)gmail(dot)com>
>>> Cc: pgsql-performance(at)postgresql(dot)org
>>> Date: Monday, 11 August, 2008, 10:30 AM
>>> "Jay" <arrival123(at)gmail(dot)com> writes:
>>>
>>>
>>>
>>>> I have a table named table_Users:
>>>>
>>>> CREATE TABLE table_Users (
>>>> UserID character(40) NOT NULL default
>>>>
>>>>
>>> '',
>>>
>>>
>>>> Username varchar(256) NOT NULL default
>>>>
>>>>
>>> '',
>>>
>>>
>>>> Email varchar(256) NOT NULL default
>>>>
>>>>
>>> ''
>>>
>>>
>>>> etc...
>>>> );
>>>>
>>>> The UserID is a character(40) and is generated using
>>>>
>>>>
>>> UUID function. We
>>>
>>>
>>>> started making making other tables and ended up not
>>>>
>>>>
>>> really using
>>>
>>>
>>>> UserID, but instead using Username as the unique
>>>>
>>>>
>>> identifier for the
>>>
>>>
>>>> other tables. Now, we pass and insert the Username to
>>>>
>>>>
>>> for discussions,
>>>
>>>
>>>> wikis, etc, for all the modules we have developed. I
>>>>
>>>>
>>> was wondering if
>>>
>>>
>>>> it would be a performance improvement to use the 40
>>>>
>>>>
>>> Character UserID
>>>
>>>
>>>> instead of Username when querying the other tables, or
>>>>
>>>>
>>> if we should
>>>
>>>
>>>> change the UserID to a serial value and use that to
>>>>
>>>>
>>> query the other
>>>
>>>
>>>> tables. Or just keep the way things are because it
>>>>
>>>>
>>> doesn't really make
>>>
>>>
>>>> much a difference.
>>>>
>>>>
>>> Username would not be any slower than UserID unless you
>>> have a lot of
>>> usernames longer than 40 characters.
>>>
>>> However making UserID an integer would be quite a bit more
>>> efficient. It would
>>> take 4 bytes instead of as the length of the Username which
>>> adds up when it's
>>> in all your other tables... Also internationalized text
>>> collations are quite a
>>> bit more expensive than a simple integer comparison.
>>>
>>> But the real question here is what's the better design.
>>> If you use Username
>>> you'll be cursing if you ever want to provide a
>>> facility to allow people to
>>> change their usernames. You may not want such a facility
>>> now but one day...
>>>
>>>
>>>
>>
>> I don't understand Gregory's suggestion about the design. I thought using
>> natural primary keys as opposed to surrogate ones is a better design
>> strategy, even when it comes to performance considerations and even more so
>> if there are complex relationships within the database.
>>
>> Regards,
>> Valentin
>>
>>
>>
> UUID is already a surrogate key not a natural key, in no aspect better than
> a numeric key, just taking a lot more space.
>
> So why not use int4/int8?
>
>
>
>

--
Regards,
Jay Kang

This e-mail is intended only for the proper person to whom it is addressed
and may contain legally privileged and/or confidential information. If you
received this communication erroneously, please notify me by reply e-mail,
delete this e-mail and all your copies of this e-mail and do not review,
disseminate, redistribute, make other use of, rely upon, or copy this
communication. Thank you.

In response to

Browse pgsql-performance by date

  From Date Subject
Next Message Gregory Stark 2008-08-12 12:51:36 Re: Using PK value as a String
Previous Message Mario Weilguni 2008-08-12 09:58:39 Re: Using PK value as a String