Re: partial "on-delete set null" constraint

From: Alban Hertroys <haramrae(at)gmail(dot)com>
To: Rafal Pietrak <rafal(at)ztk-rp(dot)eu>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: partial "on-delete set null" constraint
Date: 2015-01-03 15:48:27
Message-ID: 862E440D-37A6-4BF4-B1A1-6D5F46FC5624@gmail.com
Lists: pgsql-general


> On 03 Jan 2015, at 15:20, Rafal Pietrak <rafal(at)ztk-rp(dot)eu> wrote:
>
> On 03.01.2015 at 14:11, Alban Hertroys wrote:
> [------------------]
>> You assumed a functional dependency between username and domain, while those fields actually describe independent entities that don’t necessarily go together as you found out. Hence you need to normalise further.
>>
>> For example:
>>
>> CREATE TABLE maildomains (domain text primary key, profile text not null);
>> CREATE TABLE mailusers (username text primary key);
>> CREATE TABLE maildomainusers (username text references mailusers(username), domain text references maildomains(domain), primary key (username, domain));
>> CREATE TABLE mailboxes (username text references mailusers(username) on update cascade on delete set null, domain text not null references maildomains(domain) on update cascade, mailmessage text not null);
>
> I don't think that this table set actually describes "an ordinary mailhub", which is what I'm coding.

An “ordinary mail hub” is rather subject to interpretation, so that depends on your definition of it. As I understand it, your “mail hub” collects mails from several domains for various users? I’m not really sure about the benefits of such an application, unless internet connections to the domains you’re playing hub for are really flaky - but that’s just a guess.

> the "on delete set null" within mailboxes(username) acts only on a delete executed at mailusers, while the delete in question will be executed on maildomainusers.

It was but an example I cooked up quickly from the info you provided. Yeah, you would have to set the username reference to NULL by hand if you delete from maildomainusers. That could easily be done using a trigger on maildomainusers, though.
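
A minimal sketch of such a trigger, assuming the example tables quoted above. The function and trigger names are made up for illustration, and this is only one way to do it:

```sql
-- Hypothetical names; assumes the maildomainusers and mailboxes
-- tables from the example schema quoted above.
CREATE FUNCTION maildomainusers_release_mailboxes() RETURNS trigger AS $$
BEGIN
    -- Detach only the mails of this user within this one domain;
    -- the same username in other domains is left alone.
    UPDATE mailboxes
       SET username = NULL
     WHERE username = OLD.username
       AND domain   = OLD.domain;
    RETURN OLD;  -- the return value is ignored for AFTER row triggers
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER maildomainusers_release_mailboxes
    AFTER DELETE ON maildomainusers
    FOR EACH ROW
    EXECUTE PROCEDURE maildomainusers_release_mailboxes();
```

This only covers deletes. If rows in maildomainusers can also be updated to a different username or domain, that case would need its own handling.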

> In particular "postmaster", as a single entity in the mailusers table, will have as many entries in maildomainusers as there are domains in maildomains. But some domains may live without a postmaster user ... or a postmaster user may be replaced by an alias (another table, not presented for clarity). In such a case, the postmaster user will be dropped from maildomainusers, but will remain in the mailusers table for other domains to reference. And the delete of that postmaster user from maildomainusers will not fire back into mailboxes to set the postmaster username to null for mails within that domain.

That description makes your problem a lot easier to envision.

> Pity. So I must look for some sort of trigger functions .... as I've already started, but nothing came up functioning as I'd need it to.
>
>>
>>> Would it violate the SQL standard (significantly) if an "on delete set null" action just ignored all the FK columns that have a "NOT NULL" constraint set?
>> Yes. You would end up with a non-unique reference to the foreign table, as the tuple (domain, NULL) could reference _any_ mailuser in a domain: NULL means ‘unknown’, any username might match that.
>
> Yes. This is precisely the "semantics" I'm trying to put into the schema: after a username is "released" from service, all its messages become "from unknown user".... unless thoroughly investigated :)

It also makes a foreign key reference unusable: There is no unique parent record to match it to, so what exactly are you referencing?

Besides, with the schema you gave, “unless thoroughly investigated” is not going to help much to find the user; that information is no longer present unless you also store it elsewhere (for example inside your mailbox message data).

>>
>> As I understand it, this is precisely why Boyce-relationality forbids NULLs in primary keys, although I’m not so sure he’s right about that.
>>
>
> Having only slight theoretical background, I'd say: it could be "partially" the reason. I think that "primary key" is just a syntactic shortcut for "unique AND not null", used so often that the shortcut is much appreciated. But "just unique", meaning unique just for values that "happen to be known", is also useful, and thus it is allowed on an equal basis.... only for other usage scenarios.

I’m in the middle of (finally) receiving that theoretical background, so I know where you come from. I’m also in the fortunate position to have all that theoretical jargon at the ready ;)

Until recently I used to think the same way about NULLs in PK's, and it holds true when you only look at the PK.
However, once you add foreign key references to a table with such a PK, things change. FK’s are supposed to reference a single unique entity in a parent table, but when there are NULLs in the mix, that becomes impossible.
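
To illustrate, here is a small sketch with two made-up tables (parent and child are hypothetical, not part of your schema). It assumes PostgreSQL's default behaviour for unique constraints and for foreign keys (MATCH SIMPLE):

```sql
-- Hypothetical tables, only to illustrate NULLs in referenced keys.
CREATE TABLE parent (a text, b text, UNIQUE (a, b));
CREATE TABLE child  (a text, b text,
                     FOREIGN KEY (a, b) REFERENCES parent (a, b));

-- Both inserts are accepted: NULLs are not considered equal to each
-- other, so the unique constraint does not see a duplicate.
INSERT INTO parent VALUES ('example.org', NULL);
INSERT INTO parent VALUES ('example.org', NULL);

-- Also accepted, although no such parent row exists: with the default
-- MATCH SIMPLE the reference is not checked when any FK column is NULL.
INSERT INTO child VALUES ('no-such-domain', NULL);

-- The comparison yields NULL rather than true.
SELECT NULL = NULL;
```

So a child row with a NULL in its key does not point at any particular parent row, which is the sense in which the reference stops being usable.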

Cheers,

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.
