Re: Referential integrity (foreign keys) across multiple tables

From: Richard Huxton <dev(at)archonet(dot)com>
To: Richard Jones <rich(at)annexia(dot)org>
Cc: pgsql-sql(at)postgresql(dot)org
Subject: Re: Referential integrity (foreign keys) across multiple tables
Date: 2006-07-24 19:32:53
Message-ID: 44C52065.8070209@archonet.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-sql

Richard Jones wrote:
> Simplified schema:
>
> create table hosts (
> id serial primary key,
> hostname text not null
> );
>
> create table pages (
> id serial primary key,
> hostid int not null references hosts (id),
> url text not null,
> unique (hostid, url)
> );
>
> create table page_contents (
> pageid int not null references pages (id),
> section text not null
> );
>
> (There are many hosts, many pages per host, and many page_contents
> sections per page).
>
> Now I want to add a column to page_contents, say called link_name,
> which is going to reference the pages.url column for the particular
> host that this page belongs to.

Ah! It's only from reading later messages that I realise you're trying
to ensure that a link in page_contents can only reference pages on the
same host as itself.

> Something like:
>
> alter table page_contents add link_name text;
> alter table page_contents
> add constraint foo foreign key (p.hostid, link_name)
> references pages (hostid, url)
> where p.id = pageid;
>
> Obviously that second statement isn't going to compile.
>
> I don't want to add the hostid column to page_contents table because I
> have a lot of old code accessing the database which would be hard to
> change (the old code would no longer be able to insert page_contents
> rows).

I'm sure you know, but for the benefit of the list the problem here is
that the surrogate primary-key on "pages" has concealed valuable
information (i.e. the hostid) from the "page_contents" table.
Unfortunate that you can't update the application [* see rant below]

> Is this possible somehow? Perhaps by adding a second table? Do I
> have to use triggers, and if so is that as robust as referential
> integrity?

Write triggers to enforce the host dependency. Just make a table of all
the columns involved on all three tables and decide what (if anything)
should happen when a value is inserted/updated/deleted. Write your
triggers then write a short test script. Oh, test it with concurrent
transactions too - just to make sure it's doing what you'd expect.

The only reason hand-built triggers would be less reliable than built-in
foreign-keys is because they'll have had less testing.

The other alternative would be to add the hostid column to page_contents
and use a BEFORE trigger to set it based on the pageid the application
inserts. Then you can have a straightforward foreign-key. If your
application can't cope with having a column added to the table you might
have to mask this with a view.

[*minor rant follows:
Why is it that when changes to the database schema break applications
because the column-names are hard-wired, it is the RDBMS that is at
fault? I'm prepared to bet good money (up to the value of say, a whole
pound) that the primary key is just being used anonymously in the
application and that if the database interface had been able to refer to
it as "column(s) referencing pages primary-key" then Richard wouldn't
have a problem to post about.
And it's relational databases that aren't "agile" enough for the modern
world, not most of your programming languages.
]

--
Richard Huxton
Archonet Ltd

In response to

Browse pgsql-sql by date

  From Date Subject
Next Message Kevin Nikiforuk 2006-07-24 20:49:52 Re: CREATE TABLE AS inside of a function
Previous Message Richard Jones 2006-07-24 17:57:51 Re: Referential integrity (foreign keys) across multiple tables