Re: User defined types -- Social Security number...

From: "Tom Hebbron" <news_user(at)hebbron(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org(dot)pgsql-sql"(at)postgresql(dot)org
Subject: Re: User defined types -- Social Security number...
Date: 2004-03-03 01:20:43
Message-ID: c23bth$su1$1@news.hub.org
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-sql

"Greg Patnude" <gpatnude(at)hotmail(dot)com> wrote in message
news:c22ceg$s1a$1(at)news(dot)hub(dot)org(dot)(dot)(dot)
> "Michael Chaney" <mdchaney(at)michaelchaney(dot)com> wrote in message
> news:20040301211843(dot)GB19105(at)michaelchaney(dot)com(dot)(dot)(dot)
> > On Mon, Mar 01, 2004 at 09:42:48AM -0800, Steve Crawford wrote:
> > > I missed the start of this thread but will chime in with a comment
> > > anyway.
> > >
> > > My rule is to select an appropriate numeric type of data if you will
> > > be doing numeric types of things to it, character types if you will
> > > be doing character manipulations, etc.
> > >
> > > I don't know of any good reasons to need to know SSN/6.9, sqrt(SSN),
> > > SSN+7.86 but there are plenty of good reasons to need the first three
> > > characters (the "area number"), the middle two characters (the "group
> > > number", and the last 4 characters (the "serial number", often
> > > (ab)used as a password for banking and other purposes).
> >
> > Another excellent point. I often store zip codes as text for this
> > reason.
> >
> > The only other thing that I would mention is that if the SSN field in
> > the db will be a key of some sort, which is often the case, then it
> > might be more efficient to store it as an integer. It might be more
> > efficient to store it as a character string. The author should test
> > in this case to determine the most efficient way.
> >
> > As for character vs. integer manipulations, in most scripting style
> > languages, which is pretty much exlusively what I use, there's no
> > need to think about types, and something like an SSN will silently
> > change between being character or integer depending on what operations
> > are being performed on it.
> >
> > Michael
> > --
> > Michael Darrin Chaney
> > mdchaney(at)michaelchaney(dot)com
> > http://www.michaelchaney.com/
> >
> > ---------------------------(end of broadcast)---------------------------
> > TIP 9: the planner will ignore your desire to choose an index scan if
your
> > joining column's datatypes do not match
> >
>
> Ther are some other points I'd like to make --
>
> If I store the SSN as an integer -- theoretically -- leading zeroes will
be
> stripped (041-99-9999) -- my OWN ssn is a perfect example of this as it
> starts with a leading zero...
>
> This would cause a problem in that one of the requirements of an SSN is
that
> the length be exactly 9 digits or 9 chars WITHOUT the dashes.... so a
CHECK
> CONSTRAINT would be useful...
>
> But if the SSN is stored as an integer -- there is no check constraint
that
> wouldn't fail for SSNs that start with one or more zeroes.... So I thought
> how about a varchar(9) field and insert/update triggers that do the
> formatting (adding the dashes on insert/update --) and validate the check
> contraints (9 chars + the dashes)...
>
> The two extra characters making a varchar(11) field are not a concern in
the
> normalization or schema... I simply wanted a formatting function so that I
> dont have to do it in my scripting language or use the same CAST over and
> over and over in my select/insert/update statements....
>
> I am mainly looking to do the formatting automatically rather than having
to
> constantly format such a simple piece of data...
>
> It would be really sweet in postgreSQL if we could apply the equivalent of
a
> printf(columnname) to the table definition -- MS Access has what they call
> an "input mask" and it comes in really handy -- however -- I havent used
> Access for anthing serious for about 4 years...
>
>
> --
> Greg Patnude / The Digital Demention
> 2916 East Upper Hayden Lake Road
> Hayden Lake, ID 83835
> (208) 762-0762

You might want to look at CREATE DOMAIN

e.g. (for ISBNs, we want to check the format, and the check digit - replace
with suitable regex and validation function for social security numbers)

CREATE OR REPLACE FUNCTION utilities."validate_ISBN_check_digit"(char(10))
RETURNS boolean AS '
DECLARE
isbn_sum int:=0;
BEGIN
IF ($1 ~ ''^[0-9]{9}[0-9Xx]{1}$''::text) THEN
FOR i IN 1..10 LOOP
isbn_sum:= CASE
WHEN substring($1 from i for 1) IN (''X'',''x'') AND
i=10 THEN isbn_sum + (11-i * 10)
ELSE isbn_sum + (11-i * substring($1 from i for
1)::int)
END;
END LOOP;
IF mod(isbn_sum,11) = 0 THEN
RETURN ''t'';
END IF;
END IF;
RETURN ''f'';
END;
' LANGUAGE 'plpgsql';
COMMENT ON FUNCTION utilities."validate_ISBN_check_digit"(char(10)) is
'validation function for ISBN check digits';

CREATE DOMAIN utilities."ISBN" AS char(10) CONSTRAINT "ISBN format" CHECK
(VALUE ~ '^[0-9]{9}[0-9Xx]{1}$'::text) CONSTRAINT "ISBN checkdigit" CHECK
(utilities."validate_ISBN_check_digit"(VALUE));

Hope that helps

--
Tom Hebbron
www.hebbron.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message 帅猛 2004-03-03 02:41:17 docs on tsearch2
Previous Message Neil Conway 2004-03-03 00:01:11 Re: WAL Optimisation - configuration and usage

Browse pgsql-sql by date

  From Date Subject
Next Message Andreas Schmitz 2004-03-03 10:48:39 oracle decode()
Previous Message Chris Browne 2004-03-02 17:30:54 Re: User defined types -- Social Security number...