cross-table constraints?

From: Kevin Hunter Kesling <kmhunte2(at)ncsu(dot)edu>
To: Postgres Novice List <pgsql-novice(at)postgresql(dot)org>
Subject: cross-table constraints?
Date: 2013-08-27 18:41:04
Message-ID: 521CF2C0.6060100@ncsu.edu
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-novice

Hello Postgres Novice Helpers,

I don't know if this is the correct place for this question, but please
redirect as appropriate.

I am trying to create a datastore for a mathematical optimization model
I've built. The math model data structure has a lot of
interdependencies and data integrity has been a constant battle over the
past year. I've been made aware that data integrity is the I in ACID,
and is exactly the reason RDBMSs were created. Great!

I've spent some time putting together a schema that appears to make
sense, taking care of such concepts as "single point of authority" and
"normalization". I've tried to normalize it as best I could, but I
don't exactly have a lot of practice designing schemas. I'm now running
into a situation where I can't create a constraint that I think I need.
My research suggests that the type of constraint I want is
colloquially known as a "cross-table constraint", and implies that I
have not properly normalized or otherwise structured my schema. So, I
would highly appreciate any pointers this list could give me.

For the sake of the schema below, let me briefly describe what the model
needs and does.

The model is an optimization model, so at it's core finds a maximum or
minimum value of an "objective function", subject to any number of
constraints. For example, a typical objective function would be "Find
the minimum cost to supply energy to this system."

The model is organized into analyses, and each analysis can have a set
of processes. I define 'process' as a technology-vintage tuple <t, v>.
For example, a technology 'e_coal' might describe an electric coal
power plant, and <e_coal, 1980> would describe a coal power plant built
in 1980.

We might define the amount of capacity installed as:

CAP[e_coal, 1980] = 100 # some number of units

The analysis also has the notion of a first period. For example, if the
analysis in question had these vintages:

1980, 1990, 2000, 2010, 2020, 2030, 2040

And period_0 (the first period), was 2010, then the model only makes
decisions about the periods 2010, 2020, and 2030.

This means that to /utilize/ the <e_coal, 1980> process, we might define
some "activity" variables for it:

ACT[2010, e_coal, 1980]
ACT[2020, e_coal, 1980]
ACT[2030, e_coal, 1980]
ACT[2040, e_coal, 1980]

The ACTivity variable is indexed by period, technology, and vintage
<p,t,v>. *Note that the set of periods (e.g., 2010, 2020, 2030, 2040)
is a subset of the set of vintages.*

Here's where I get to how this relates to my question: as the model can
only make decisions with variables that describe future time periods
(e.g., ACT[2010,...], ACT[2020,...], ACT[2030,...], ...), it makes no
sense to define any characteristics for periods prior to 2010. For
example, if the use of the <e_coal, 1980> cost different amounts in
different periods, I might have data like this:

Cost[2010, e_coal, 1980] = 0.10 # $/kWh
Cost[2020, e_coal, 1980] = 0.30 # $/kWh
Cost[2030, e_coal, 1980] = 0.60 # $/kWh
Cost[2040, e_coal, 1980] = 0.33 # $/kWh

However, what's currently to prevent a user from giving data for 2000,
or 1990? At the moment, nothing other than my application logic. For
example, consider this (minimized) schema:

CREATE TABLE analysis (
id INTEGER NOT NULL PRIMARY KEY,
name VARCHAR(32767) NOT NULL UNIQUE,
period_0 INTEGER NOT NULL
);

CREATE TABLE vintage (
id INTEGER NOT NULL PRIMARY KEY,
analysis_id INTEGER NOT NULL REFERENCES analysis (id),
vintage INTEGER NOT NULL
);

CREATE TABLE technology (
id integer NOT NULL PRIMARY KEY,
name varchar(32767) NOT NULL UNIQUE
);

CREATE TABLE process (
id INTEGER NOT NULL PRIMARY KEY,
analysis_id INTEGER NOT NULL REFERENCES analysis (id),
technology_id INTEGER NOT NULL REFERENCES technology (id),
vintage_id INTEGER NOT NULL REFERENCES vintage (id),

UNIQUE (analysis_id, technology_id, vintage_id)
);

CREATE TABLE cost (
id integer NOT NULL PRIMARY KEY,
period_id integer NOT NULL REFERENCES vintage (id),
process_id integer NOT NULL REFERENCES process (id),
value real NOT NULL,

UNIQUE (period_id, process_id)
);

Ideally, I'd have a constraint on the 'cost' table that required that
the period referenced by period_id was not less than the associated
analysis period_0. I would like any INSERT statement that mimicks this
printf format string would fail:

INSERT INTO cost (period_id, process_id, value)
VALUES (%d, %d, %f);

(where, period_id maps to a vintage.vintage that is less than
analysis.period_0)

What would be a good way to reorganize my schema to enable a constraint
that effectively limits the allowable rows of cost, subject to the value?

Many thanks (and apologies for the long post!),

Kevin

Responses

Browse pgsql-novice by date

  From Date Subject
Next Message Michael Swierczek 2013-08-27 19:05:53 Re: cross-table constraints?
Previous Message Charles Sheridan 2013-08-27 14:17:16 CTAGS for PL/pgSQL ?