Re: cross-table constraints?

From: Michael Swierczek <mike(dot)swierczek(at)gmail(dot)com>
To: Kevin Hunter Kesling <kmhunte2(at)ncsu(dot)edu>
Cc: Postgres Novice List <pgsql-novice(at)postgresql(dot)org>
Subject: Re: cross-table constraints?
Date: 2013-08-27 19:05:53
Message-ID: CAHp1f1ObVSwn25==cLV0i0AEMaUQQ3tucm2Oz2M7-xU8sPyW0g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

( See response below. )

On Tue, Aug 27, 2013 at 2:41 PM, Kevin Hunter Kesling <kmhunte2(at)ncsu(dot)edu> wrote:
> Hello Postgres Novice Helpers,
>
> I don't know if this is the correct place for this question, but please
> redirect as appropriate.
>
> I am trying to create a datastore for a mathematical optimization model I've
> built. The math model data structure has a lot of interdependencies and
> data integrity has been a constant battle over the past year. I've been
> made aware that data integrity is the I in ACID, and is exactly the reason
> RDBMSs were created. Great!
>
> I've spent some time putting together a schema that appears to make sense,
> taking care of such concepts as "single point of authority" and
> "normalization". I've tried to normalize it as best I could, but I don't
> exactly have a lot of practice designing schemas. I'm now running into a
> situation where I can't create a constraint that I think I need. My
> research suggests that the type of constraint I want is colloquially known
> as a "cross-table constraint", and implies that I have not properly
> normalized or otherwise structured my schema. So, I would highly appreciate
> any pointers this list could give me.
>
> For the sake of the schema below, let me briefly describe what the model
> needs and does.
>
> The model is an optimization model, so at it's core finds a maximum or
> minimum value of an "objective function", subject to any number of
> constraints. For example, a typical objective function would be "Find the
> minimum cost to supply energy to this system."
>
> The model is organized into analyses, and each analysis can have a set of
> processes. I define 'process' as a technology-vintage tuple <t, v>. For
> example, a technology 'e_coal' might describe an electric coal power plant,
> and <e_coal, 1980> would describe a coal power plant built in 1980.
>
> We might define the amount of capacity installed as:
>
> CAP[e_coal, 1980] = 100 # some number of units
>
> The analysis also has the notion of a first period. For example, if the
> analysis in question had these vintages:
>
> 1980, 1990, 2000, 2010, 2020, 2030, 2040
>
> And period_0 (the first period), was 2010, then the model only makes
> decisions about the periods 2010, 2020, and 2030.
>
> This means that to /utilize/ the <e_coal, 1980> process, we might define
> some "activity" variables for it:
>
> ACT[2010, e_coal, 1980]
> ACT[2020, e_coal, 1980]
> ACT[2030, e_coal, 1980]
> ACT[2040, e_coal, 1980]
>
> The ACTivity variable is indexed by period, technology, and vintage <p,t,v>.
> *Note that the set of periods (e.g., 2010, 2020, 2030, 2040) is a subset of
> the set of vintages.*
>
> Here's where I get to how this relates to my question: as the model can only
> make decisions with variables that describe future time periods (e.g.,
> ACT[2010,...], ACT[2020,...], ACT[2030,...], ...), it makes no sense to
> define any characteristics for periods prior to 2010. For example, if the
> use of the <e_coal, 1980> cost different amounts in different periods, I
> might have data like this:
>
> Cost[2010, e_coal, 1980] = 0.10 # $/kWh
> Cost[2020, e_coal, 1980] = 0.30 # $/kWh
> Cost[2030, e_coal, 1980] = 0.60 # $/kWh
> Cost[2040, e_coal, 1980] = 0.33 # $/kWh
>
> However, what's currently to prevent a user from giving data for 2000, or
> 1990? At the moment, nothing other than my application logic. For example,
> consider this (minimized) schema:
>
> CREATE TABLE analysis (
> id INTEGER NOT NULL PRIMARY KEY,
> name VARCHAR(32767) NOT NULL UNIQUE,
> period_0 INTEGER NOT NULL
> );
>
> CREATE TABLE vintage (
> id INTEGER NOT NULL PRIMARY KEY,
> analysis_id INTEGER NOT NULL REFERENCES analysis (id),
> vintage INTEGER NOT NULL
> );
>
> CREATE TABLE technology (
> id integer NOT NULL PRIMARY KEY,
> name varchar(32767) NOT NULL UNIQUE
> );
>
> CREATE TABLE process (
> id INTEGER NOT NULL PRIMARY KEY,
> analysis_id INTEGER NOT NULL REFERENCES analysis (id),
> technology_id INTEGER NOT NULL REFERENCES technology (id),
> vintage_id INTEGER NOT NULL REFERENCES vintage (id),
>
> UNIQUE (analysis_id, technology_id, vintage_id)
> );
>
> CREATE TABLE cost (
> id integer NOT NULL PRIMARY KEY,
> period_id integer NOT NULL REFERENCES vintage (id),
> process_id integer NOT NULL REFERENCES process (id),
> value real NOT NULL,
>
> UNIQUE (period_id, process_id)
> );
>
> Ideally, I'd have a constraint on the 'cost' table that required that the
> period referenced by period_id was not less than the associated analysis
> period_0. I would like any INSERT statement that mimicks this printf format
> string would fail:
>
> INSERT INTO cost (period_id, process_id, value)
> VALUES (%d, %d, %f);
>
> (where, period_id maps to a vintage.vintage that is less than
> analysis.period_0)
>
> What would be a good way to reorganize my schema to enable a constraint that
> effectively limits the allowable rows of cost, subject to the value?
>
> Many thanks (and apologies for the long post!),
>
> Kevin
>
>
> --
> Sent via pgsql-novice mailing list (pgsql-novice(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-novice

Kevin,

I am not a database expert, I don't see a way to enforce what you want
directly in the uniqueness and foreign key constraints of the schema.
But I believe you should be able to enforce the constraint with a
trigger. http://www.postgresql.org/docs/current/static/plpgsql-trigger.html

I may have misunderstood your example, but I think this _might_ do
what you want:

CREATE FUNCTION cost_period_check() RETURNS trigger AS $cost_period_check$
DECLARE
a_row analysis%ROWTYPE;
BEGIN
SELECT INTO a_row analysis.* FROM analysis
INNER JOIN process ON process.analysis_id = analysis.id
WHERE process.id = NEW.process_id LIMIT 1;
IF a_row.period_0 > NEW.period_id THEN
RAISE EXCEPTION '% is before analysis period_0 %',
NEW.period_id, a_row.period_0;
END IF;
RETURN NEW;
END;
$cost_period_check$ LANGUAGE plpgsql;

CREATE TRIGGER cost_trigger BEFORE INSERT OR UPDATE ON cost
FOR EACH ROW EXECUTE PROCEDURE cost_period_check();

Does that help?
-Mike

In response to

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message David Johnston 2013-08-27 19:33:45 Re: cross-table constraints?
Previous Message Kevin Hunter Kesling 2013-08-27 18:41:04 cross-table constraints?