Re: steps to ensure referential integrity

From: Robert Treat <xzilla(at)users(dot)sourceforge(dot)net>
To: aturner(at)neteconomist(dot)com
Cc: "scott(dot)marlowe" <scott(dot)marlowe(at)ihs(dot)com>, Jodi Kanter <jkanter(at)virginia(dot)edu>, Postgres Admin List <pgsql-admin(at)postgresql(dot)org>
Subject: Re: steps to ensure referential integrity
Date: 2003-09-24 14:11:44
Message-ID: 1064412704.30308.2028.camel@camel
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-admin

On Tue, 2003-09-23 at 13:41, aturner(at)neteconomist(dot)com wrote:
> On Tue, Sep 23, 2003 at 10:52:27AM -0600, scott.marlowe wrote:
> > On Mon, 22 Sep 2003, Jodi Kanter wrote:
> >
> > > I have inherited a database that was originally created in postgres
> > > version 6.? There were no steps taken at that time to implement
> > > constraints to ensure referential integrity. There are no FK
> > > constraints, PK constraints, etc..
> > > Most of what currently exists is done in the existing code. I would like
> > > to change this and am interested in getting suggestions. I expect that
> > > PK constraints on each table as well as all FK constraints should be in
> > > place. Anything else that I should consider?
> > > I have indexes currently set up on a variety of fields, including PK
> > > fields. Do the constraints automatically generate indexes on these
> > > fields such that I should remove the ones created at the end of my script?
> > > Thanks in advance for the suggestions!
> >
> > I would first upgrade to a reasonably new version of Postgesql 6.5.x is
> > pretty old. 7.3.4 or so should be a good choice. You'll find that you
> > can add the pk/fk constraints in the conversion process, and still have a
> > faster database with fk/pk under 7.3 than you'd have with no fk/pk
> > constraints in 6.5.3
> >
>
> Creating a primary key also creates a unique index on that field(s). Depending on how normalised the data schema is, you may also have to create stored procedures to gaurentee data integrity. If the data is not currently good, you may have to remove bad data to get the constraints to check. I'm no guru, but I do something like this to at least check:
>
> select count(distinct(id)) from table;
> vs
> select count(id) from table;
>

select count(distinct(id)), count(id), count(*) from table

> if the numbers are different, you have dup ids.
>
> To get a list:
> select id,count(*) from table group by (id);
>
> I hope someone can figure out how how to put a where on that so it only returns those above 1, but I personaly don't know how, so I'll use my friend the shell:
>

select id,count(*) from table group by id having count(*) > 1;

and actually just running that would allow you to skip the check from
above.

HTH

Robert Treat

> psql database -Uuser -c 'select id,count(*) from table gropu by id' | grep -v '1$';
>
> Alex Turner

--
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL

In response to

Browse pgsql-admin by date

  From Date Subject
Next Message scott.marlowe 2003-09-24 14:11:51 Re: [GENERAL] Fatal error: Allowed memory size of 8388608
Previous Message Tom Lane 2003-09-24 14:09:39 Re: vacuum failed - pgtoast not btree